Abstract


We propose a general class of asymptotically distribution-free tests of a
linear hypothesis in the linear regression model. The tests are based on
regression rank scores, recently introduced by Gutenbrunner and
Jureckova (1990) as dual variables to the regression quantiles of Koenker
and Bassett (1978). Their properties are analogous to those of the
corresponding rank tests in the location model. Unlike the other regression
tests based on aligned rank statistics, however, our tests do not require
preliminary estimation of nuisance parameters; indeed, they are invariant
with respect to a regression shift of the nuisance parameters.
1.      Introduction

Several  authors  including  Koul  (1970),  Puri  and  Sen  (1985)  and  Adichie  (1978) 
have  developed  asymptotically  distribution-free  tests  of  linear  hypotheses  for  the  linear 
regression  model  based  upon  aligned  rank  statistics.  Excellent  reviews  of  these  results 
including  extensions  to  multivariate  models  may  be  found  in  Puri  and  Sen  (1985)  and 
the  survey  paper  of  Adichie  (1984).  The  hypothesis  under  consideration  typically 
involves  nuisance  parameters  which  require  preliminary  estimation;  the  aligned  (or 
signed) rank statistics are then based on residuals from the preliminary estimate.
Alternative approaches to inference based on rank estimation have been considered by
McKean and Hettmansperger (1978), Aubuchon and Hettmansperger (1988) and Draper
(1988) among others.

A  completely  new  approach  to  the  construction  of  rank  statistics  for  the  linear 
model  has  recently  been  introduced  by  Gutenbrunner  and  Jureckova  (1992).  Their 
approach  is  based  on  the  dual  solutions  to  the  regression  quantile  statistics  of  Koenker 
and Bassett (1978). These regression rank scores represent a natural extension of the
"location rank scores" introduced by Hajek and Sidak (1967, Section V.3.5), which play a
fundamental role in the classical theory of rank statistics. In this paper we consider tests
of  a  general  linear  hypothesis  for  the  linear  regression  model  based  upon  regression  rank 
scores.  These  tests  have  the  advantages  of  more  familiar  rank  tests:  they  are  robust  to 
outliers  in  the  response  variable  and  they  are  asymptotically  distribution  free  in  the  sense 
that  no  nuisance  parameter  depending  on  the  error  distribution  need  be  estimated  in  order 
to  compute  the  test  statistic.  Furthermore,  they  are  considerably  simpler  than  many  of 
the  proposed  aligned  rank  tests  which  require  preliminary  estimation  of  the  linear  model 
by computationally demanding rank estimation methods. The robustness of the proposed
tests and the sensitivity of the aligned rank procedures to response outliers are illustrated
in the sensitivity analysis of the example discussed in Section 2.

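In the classical linear model,

$$Y = X\beta + E, \qquad (1.1)$$

the vector $\hat\beta(\alpha) = (\hat\beta_1(\alpha), \ldots, \hat\beta_p(\alpha))' \in R^p$ of $\alpha$th regression quantiles is any solution of
the problem

$$\min_{t \in R^p} \sum_{i=1}^{n} \rho_\alpha(Y_i - x_i' t), \qquad (1.2)$$

where

$$\rho_\alpha(u) = |u| \, \{(1-\alpha) I[u<0] + \alpha I[u>0]\}, \quad u \in R^1. \qquad (1.3)$$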
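As a minimal sketch of (1.2) and (1.3) in the one-sample location case, the following standard-library Python evaluates the check function and minimizes the objective over the observations themselves. The function names `rho` and `location_quantile` and the five data values are illustrative assumptions, not part of the paper.

```python
def rho(u, alpha):
    """Check function (1.3): |u| * ((1 - alpha) * I[u < 0] + alpha * I[u > 0])."""
    return abs(u) * ((1 - alpha) if u < 0 else alpha)


def location_quantile(y, alpha):
    """Minimize sum_i rho(y_i - t, alpha) over t restricted to the data points.

    With X = 1_n the objective is convex and piecewise linear in t with kinks
    at the observations, so a minimizer always exists among the observations.
    """
    return min(sorted(y), key=lambda t: sum(rho(yi - t, alpha) for yi in y))


# Hypothetical sample, used only for illustration.
y = [2.0, 7.0, 1.0, 9.0, 4.0]
median = location_quantile(y, 0.5)   # alpha = 1/2: least absolute error fit
upper = location_quantile(y, 0.9)    # a high quantile of the same sample
```

Restricting the search to the data points is a shortcut that works only in the location model; in regression the minimization is carried out by linear programming.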

Least absolute error regression corresponds to the median case with $\alpha = 1/2$. In the
one-sample location model, with $X = 1_n$, solutions to (1.2) are the ordinary sample quantiles:
when $n\alpha$ is an integer we have an interval of solutions between two adjacent order
statistics. Computation of the regression quantiles is greatly facilitated by expressing (1.2) as
the linear program

$$\alpha 1_n' u^+ + (1-\alpha) 1_n' u^- := \min$$

$$X\beta + u^+ - u^- = Y \qquad (1.4)$$

$$\beta \in R^p, \quad u^+, u^- \in R^n_+$$

where $1_n = (1, \ldots, 1)' \in R^n$ and $0 < \alpha < 1$. Even in this form, the problem of finding all
the regression quantile solutions may appear computationally demanding, since there
would appear to be a distinct problem to solve for each $\alpha \in (0, 1)$. Fortunately, there are
only a few distinct solutions. In the location model we know, of course, that there are at
most $n$ distinct quantiles. In regression, Portnoy (1991) has shown that the number of
distinct solutions to (1.2) is $O_p(n \log n)$. Finding all the regression quantiles is a
straightforward exercise in parametric linear programming. From any given solution for fixed $\alpha$ we
may compute the interval containing $\alpha$ for which this solution remains optimal, and one
simplex pivot brings us to a new solution at either endpoint of the interval. Proceeding in
this way we may compute the entire path $\hat\beta(\cdot)$, which is a piecewise constant function
from $[0, 1]$ to $R^p$. Detailed descriptions of algorithms to compute the regression quantiles
may be found in Koenker and d'Orey (1990), and Osborne (1992). Finite-sample as well
as asymptotic properties of $\hat\beta(\alpha)$ are studied in Koenker and Bassett (1978), Ruppert and
Carroll (1980), Jureckova (1984), Gutenbrunner (1986), Koenker and Portnoy (1987),
Gutenbrunner and Jureckova (1992), and Portnoy (1991b).

The regression rank scores introduced in Gutenbrunner and Jureckova (1992) arise
as an $n$-vector $\hat a_n(\alpha) = (\hat a_{n1}(\alpha), \ldots, \hat a_{nn}(\alpha))'$ of solutions to the dual form of the linear
program required to compute the regression quantiles. The formal dual program to (1.4) can
be written in the form

$$Y' a(\alpha) := \max$$

$$X' a(\alpha) = (1-\alpha) X' 1_n \qquad (1.5)$$

$$a(\alpha) \in [0, 1]^n, \quad 0 < \alpha < 1.$$

As shown in Gutenbrunner and Jureckova (1992), many aspects of the duality of order
statistics and ranks in the location model generalize naturally to the linear model through
(1.4) and (1.5). Moreover, as pointed out there, $\hat a_n(\alpha)$ is regression invariant with respect to
$X$, in the sense that $\hat a_n(\alpha)$ is unchanged if $Y$ is transformed to $Y + X\gamma$ for any $\gamma \in R^p$.

To motivate our approach, consider $\{\hat a_n(\alpha), 0 < \alpha < 1\}$ in the location model with
$X = 1_n$. In this case, $\hat a_{ni}(\alpha)$ specializes to

$$\hat a_{ni}(\alpha) = a_n^*(R_i, \alpha) =
\begin{cases}
1 & \text{if } \alpha \le (R_i - 1)/n \\
R_i - \alpha n & \text{if } (R_i - 1)/n < \alpha \le R_i/n \\
0 & \text{if } R_i/n < \alpha
\end{cases} \qquad (1.6)$$

where $R_i$ is the rank of $Y_i$ among $Y_1, \ldots, Y_n$. The function
$a_n^*(j, \alpha)$, $j = 1, \ldots, n$, $0 < \alpha < 1$, coincides exactly with that introduced in Hajek and
Sidak (1967, Section V.3.5). Under the general model (1.1), both the finite-sample and
asymptotic properties of the regression rank scores and of the process
$\{\hat a_n(\alpha), 0 < \alpha < 1\}$ are described in the next section. The regression rank score process
may be efficiently computed by standard parametric linear programming techniques,
essentially as a byproduct of the regression quantile computation: it requires no additional
computational effort, only some additional storage. See Koenker and d'Orey (1990) for
algorithmic details.
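The location-model rank score function in (1.6) can be sketched directly. The code below is an illustration only: the helper names and the sample are assumptions. It checks numerically that integrating the rank score of an observation with rank $R_i$ over $[0, 1]$ gives $(R_i - 1/2)/n$, which is the sense in which the rank score process reproduces the ordinary ranks.

```python
def location_rank_score(rank, alpha, n):
    """a*_n(R_i, alpha) from (1.6): one, then a linear descent, then zero."""
    if alpha <= (rank - 1) / n:
        return 1.0
    if alpha <= rank / n:
        return rank - alpha * n
    return 0.0


def integrate(func, lo=0.0, hi=1.0, steps=10_000):
    """Midpoint rule; adequate for a bounded piecewise-linear integrand."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (k + 0.5) * h) for k in range(steps))


# Hypothetical sample without ties.
y = [2.0, 7.0, 1.0, 9.0, 4.0]
n = len(y)
ranks = [sorted(y).index(v) + 1 for v in y]
areas = [integrate(lambda t, r=r: location_rank_score(r, t, n)) for r in ranks]
```

Subtracting 1/2 from each entry of `areas` gives centered scores of the Wilcoxon type.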


The formal duality between $\hat\beta(\alpha)$ and $\hat a(\alpha)$ implies that for $i = 1, \ldots, n$

$$\hat a_{ni}(\alpha) =
\begin{cases}
1 & \text{if } Y_i > \sum_{j=1}^{p} x_{ij} \hat\beta_j(\alpha) \\
0 & \text{if } Y_i < \sum_{j=1}^{p} x_{ij} \hat\beta_j(\alpha)
\end{cases} \qquad (1.7)$$

while the components of $\hat a_n(\alpha)$ corresponding to $\{i \mid y_i = x_i' \hat\beta(\alpha)\}$ are determined by
the equality constraints of (1.5). Thus, as in the location model, the regression rank score
for observation $i$ is one while $y_i$ is above the $\alpha$th quantile regression plane, zero
while $y_i$ falls below this plane, and takes an intermediate value while $y_i$ falls on the $\alpha$th
plane. Integrating the regression rank score function for each observation over [0,1]
yields a vector of (Wilcoxon) ranks: observations falling "below" most of the others
receive small ranks, while those falling "above" the others, and thus having rank score
one over a wide interval, receive large ranks. This observation is completely transparent
in the location model where "above" and "below" have an obvious interpretation. In
regression, the interpretation of these terms relies on the optimization problem defining
the regression quantiles. The resulting rank scores, illustrated, for example, in Figure 2.1,
are, we believe, a useful graphical diagnostic in linear regression in addition to their role
in formal hypothesis testing.

The  next  section  of  the  paper  surveys  our  results,  establishes  some  notation,  and 
provides  an  illustrative  example.  Section  3  develops  some  theory  of  the  regression  rank 
score  process.  Section  4  treats  the  theory  of  simple  linear  rank  statistics  based  on  this 
process,  and  Section  5  contains  a  formal  treatment  of  the  proposed  tests. 


2.      Notation  and  preliminary  considerations 

We will partition the classical linear regression model

$$Y = X\beta + E \qquad (2.1)$$

as

$$Y = X_1 \beta_1 + X_2 \beta_2 + E \qquad (2.2)$$

where $\beta_1$ and $\beta_2$ are $p$- and $q$-dimensional parameters, and $X = X_n$ is a known $n \times (p+q)$
design matrix with rows $x_{ni}' = x_i' = (x_{1i}', x_{2i}') \in R^{p+q}$, $i = 1, \ldots, n$. We will assume
throughout that $x_{i1} = 1$ for $i = 1, \ldots, n$. $Y$ is a vector of observations and $E$ is an $n \times 1$
vector of i.i.d. errors with common distribution function $F$. As in the familiar
two-sample rank test, our test statistic is shift-invariant and hence independent of location.
Thus, as with other rank tests, hypotheses on the intercept cannot be tested. This is
immediately apparent from the regression invariance of the test statistic noted above. The
precise form of $F$ need not be known, but we shall generally assume that $F$ has an absolutely
continuous density $f$ on $(A, B)$ where $-\infty \le A = \sup\{x: F(x) = 0\}$ and
$+\infty \ge B = \inf\{x: F(x) = 1\}$. Moreover, we shall impose some conditions on the tails of $f$,
assuming, among other conditions, that $f$ monotonically decreases to 0 when $x \to A+$ or
$x \to B-$. Define $D_n = n^{-1} X_1' X_1$,

$$H_1 = X_1 (X_1' X_1)^{-1} X_1' \quad \text{and} \quad Q_n = n^{-1} (X_2 - \hat X_2)'(X_2 - \hat X_2) \qquad (2.3)$$

with $\hat X_2 = H_1 X_2$ being the projection of $X_2$ on the space spanned by the columns of $X_1$.
We shall also assume

$$\lim_{n \to \infty} D_n = D, \qquad \lim_{n \to \infty} Q_n = Q \qquad (2.4)$$

where $D$ and $Q$ are positive definite $(p \times p)$ and $(q \times q)$ matrices, respectively.
We are interested in testing the hypothesis

$$H_0: \beta_2 = 0, \quad \beta_1 \text{ unspecified} \qquad (2.5)$$

versus the Pitman (local) alternatives

$$H_n: \beta_{2n} = n^{-1/2} \beta_0 \qquad (2.6)$$

with $\beta_0$ being a fixed vector in $R^q$.

As in the classical theory of rank tests, we shall consider a score function
$\varphi: (0, 1) \to R$ which is nondecreasing and square-integrable on $(0, 1)$. We may then
construct scores based on the regression rank score process, following Hajek and Sidak
(1967), as

$$\hat b_{ni} = -\int_0^1 \varphi(t) \, d\hat a_{ni}(t), \quad i = 1, \ldots, n. \qquad (2.7)$$

Defining

$$S_n = n^{-1/2} (X_{n2} - \hat X_{n2})' \hat b_n \qquad (2.8)$$

where $\hat b_n = (\hat b_{n1}, \ldots, \hat b_{nn})'$, we propose the following statistic for testing $H_0$ against $H_n$:

$$T_n = S_n' Q_n^{-1} S_n / A^2(\varphi) \qquad (2.9)$$

where

$$A^2(\varphi) = \int_0^1 (\varphi(t) - \bar\varphi)^2 \, dt, \qquad \bar\varphi = \int_0^1 \varphi(t) \, dt \qquad (2.10)$$

and with $Q_n$ defined as in (2.3). An important feature of the test statistic $T_n$ is that it
requires no estimation of nuisance parameters, since the functional $A(\varphi)$ depends only on
the score function and not on (the unknown) $F$. This is familiar from the theory of rank
tests, but stands in sharp contrast with other methods of testing in the linear model where
typically some estimation of a scale parameter of $F$ is required to compute the test
statistic. See for example the discussion in Aubuchon and Hettmansperger (1988) and Draper
(1988).
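To make (2.7) to (2.10) concrete, here is a minimal sketch of $T_n$ in the simplest special case. It assumes that the nuisance design $X_1$ is the intercept alone ($p = 1$), that there is a single tested regressor ($q = 1$), that the responses have no ties, and that Wilcoxon scores $\varphi(t) = t - 1/2$ are used. Under these assumptions the regression rank scores reduce to the location rank scores, so $\hat b_{ni} = (R_i - 1/2)/n - 1/2$, and the projection $\hat X_2$ is the sample mean of the regressor. The function name and data are hypothetical; the general case requires the linear programming computation described in the text.

```python
import math


def wilcoxon_rank_score_test(x, y):
    """T_n of (2.9) when X_1 is the intercept only and x is the single tested regressor."""
    n = len(y)
    # Ranks R_i of the responses (assumes no ties).
    order = sorted(range(n), key=lambda i: y[i])
    rank = [0] * n
    for position, i in enumerate(order, start=1):
        rank[i] = position
    # Wilcoxon scores: integral of the location rank score minus 1/2.
    b = [(r - 0.5) / n - 0.5 for r in rank]
    # X_2 minus its projection on the intercept is the centered regressor.
    x_bar = sum(x) / n
    x_centered = [xi - x_bar for xi in x]
    s_n = sum(c * bi for c, bi in zip(x_centered, b)) / math.sqrt(n)   # (2.8)
    q_n = sum(c * c for c in x_centered) / n                          # (2.3)
    a_squared = 1.0 / 12.0                                            # (2.10)
    return s_n ** 2 / (q_n * a_squared)                               # (2.9)


# Hypothetical data: a perfectly monotone relation between x and y.
x = [float(i) for i in range(1, 11)]
t_stat = wilcoxon_rank_score_test(x, x)
t_shifted = wilcoxon_rank_score_test(x, [v + 100.0 for v in x])
```

Because the statistic depends on the responses only through their ranks in this special case, adding a constant to every response leaves it unchanged, which illustrates the shift invariance noted in the text.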

We shall show in Section 5 that the asymptotic distribution of $T_n$ under $H_0$ is
central $\chi^2$ with $q$ degrees of freedom, while under $H_n$ it is noncentral $\chi^2$ with $q$ degrees of
freedom and noncentrality parameter

$$\eta^2 = [\gamma^2(\varphi, F) / A^2(\varphi)] \, \beta_0' Q \beta_0 \qquad (2.11)$$

where

$$\gamma(\varphi, F) = -\int_0^1 \varphi(t) \, df(F^{-1}(t)). \qquad (2.12)$$

Like $A$, $\gamma$ is also familiar from the classical theory of rank tests. The test statistic $T_n$
is first-order asymptotically distribution free in the sense that the first-order term in its
asymptotic representation is exactly distribution free, as follows from (4.2). Moreover,
it follows from (2.11) that the Pitman efficiency of the test based on $T_n$ with respect to
the classical F test of $H_0$ coincides with that of the two-sample rank test of shift in
location with respect to the t-test. For $f$ unimodal, we obtain an asymptotically optimal test
if we take

$$\varphi(t) = \varphi_f(t) = -\frac{f'(F^{-1}(t))}{f(F^{-1}(t))}, \quad 0 < t < 1. \qquad (2.13)$$

Thus for Wilcoxon scores (see below) the asymptotic relative efficiency of the test
based on $T_n$ relative to the classical F test is $3/\pi = .955$ at the normal distribution and is
bounded below by .864 for all $F$. When $F$ is heavy tailed this asymptotic efficiency is
generally greater than one, and can in fact be unbounded. For normal (van der Waerden)
scores ($\varphi(u) = \Phi^{-1}(u)$) the situation is even more striking. Here the test based on $T_n$ has
asymptotic efficiency at least one, relative to the classical F test, for all symmetric
$F$, attaining one at the normal distribution. See e.g. Lehmann (1959, p. 239), and
Lehmann (1983, pp. 383-87).
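The value $3/\pi$ can be checked numerically. The sketch below assumes the standard expression $\sigma^2 \gamma^2(\varphi, F) / A^2(\varphi)$ for the Pitman efficiency relative to the F test, which is what (2.11) gives when compared with the noncentrality $\beta_0' Q \beta_0 / \sigma^2$ of the F test, and uses the Wilcoxon values $\gamma = \int f^2(x)\,dx$ and $A^2 = 1/12$ with $F$ standard normal. The helper names are illustrative.

```python
import math


def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def integrate(func, lo, hi, steps=20_000):
    """Midpoint rule on [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (k + 0.5) * h) for k in range(steps))


# gamma(phi, F) for Wilcoxon scores is the integral of f^2; the tails beyond
# +/- 10 are numerically negligible for the standard normal density.
gamma = integrate(lambda x: normal_pdf(x) ** 2, -10.0, 10.0)
a_squared = 1.0 / 12.0      # A^2(phi) for phi(t) = t - 1/2
sigma_squared = 1.0         # variance of the standard normal
are_wilcoxon = sigma_squared * gamma ** 2 / a_squared
```

Analytically $\gamma = 1/(2\sqrt{\pi})$ here, so the ratio is $12/(4\pi) = 3/\pi$.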

Let us now examine more closely the scores (2.7), which can be written as

$$\hat b_{ni} = -\int_0^1 \varphi(t) \, \hat a_{ni}'(t) \, dt, \quad i = 1, \ldots, n \qquad (2.14)$$

where the functions $\hat a_{ni}'(t) = d\hat a_{ni}(t)/dt$ are piecewise constant on [0,1]. The piecewise
linearity of the regression rank scores follows immediately from the linear programming
formulation (1.5) of the dual, greatly simplifying the computation in (2.21). In the
location model, using (2.13) this reduces to the well-known Hajek and Sidak (1967) scores

$$\hat b_{ni} = n \int_{(R_i - 1)/n}^{R_i/n} \varphi(t) \, dt, \quad i = 1, \ldots, n.$$

There are three typical choices of $\varphi$:

(i) Wilcoxon scores: $\varphi(t) = t - 1/2$, $0 < t < 1$. The scores are
$\hat b_i = -\int_0^1 (t - 1/2) \, d\hat a_i(t) = \int_0^1 \hat a_i(t) \, dt - 1/2$, while $A^2(\varphi) = 1/12$ and $\gamma(\varphi, F) = \int f^2(x) \, dx$.
Wilcoxon scores are optimal when $f$ is the logistic density.

(ii) Normal (van der Waerden) scores: $\varphi(t) = \Phi^{-1}(t)$, $0 < t < 1$, $\Phi$ being the d.f. of the
standard normal distribution. Here $A^2(\varphi) = 1$ and $\gamma(\varphi, F) = \int f(F^{-1}(\Phi(x))) \, dx$. These
scores are asymptotically optimal when $f$ is normal.

(iii) Median (sign) scores: $\varphi(t) = \frac{1}{2} \mathrm{sign}(t - \frac{1}{2})$, $0 < t < 1$. Then (2.7) leads to the
form $\hat b_{ni} = \hat a_{ni}(1/2) - 1/2$, which is $1/2$ if the $i$th $\ell_1$ residual is positive, $-1/2$ if it is
negative, and between $-1/2$ and $1/2$ otherwise.
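The constants $A^2(\varphi)$ for the three score functions can be confirmed by numerical integration of (2.10). The sketch below uses only the Python standard library; `a_squared` is a hypothetical helper, and the value $1/4$ for the sign scores is obtained here from (2.10) rather than quoted from the text.

```python
from statistics import NormalDist


def a_squared(phi, steps=50_000):
    """A^2(phi) from (2.10) by the midpoint rule on (0, 1)."""
    h = 1.0 / steps
    values = [phi((k + 0.5) * h) for k in range(steps)]
    mean = h * sum(values)                          # phi-bar
    return h * sum((v - mean) ** 2 for v in values)


wilcoxon = a_squared(lambda t: t - 0.5)                    # expected 1/12
normal = a_squared(NormalDist().inv_cdf)                   # expected 1
sign = a_squared(lambda t: 0.5 if t > 0.5 else -0.5)       # expected 1/4
```

The normal-scores integrand is unbounded near 0 and 1, so the midpoint rule converges more slowly there; the tolerance used for that case should be correspondingly looser.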


[Figure 2.1: Regression Rank Scores for Tobacco Data]

[Figure 2.2: Sensitivity Curves for Rank Tests; horizontal axis: perturbation of y(1)]


REMARK. Using the standard reduction to canonical form, e.g. Scheffe (1959, Section
2.6) or Amemiya (1985, Section 1.4.2), we may consider a more general form of the
linear hypothesis

$$R'\beta = r \in R^q$$

where $R$ is a $(p + q) \times q$ matrix of rank $q < p$. Let $V$ be a $(p + q) \times p$ matrix such that
$A = [V : R]'$ is nonsingular and $R'V = 0$. Set $\gamma = A\beta$ and $Z = XA^{-1}$. Partitioning
$\gamma = [\gamma_1', \gamma_2']'$ where $\gamma_1 = V'\beta$ and $\gamma_2 = R'\beta$, under the hypothesis (2.22) we have

$$Y - XR(R'R)^{-1} r = XV(V'V)^{-1} \gamma_1 + E.$$

Thus, in view of the equivariance of regression quantiles, see Koenker and Bassett (1978),
Theorem 3.2, we may define $\tilde Y = Y - XR(R'R)^{-1} r$, $\tilde X_1 = XV(V'V)^{-1}$, $\tilde X_2 = XR(R'R)^{-1}$,
and proceed as previously discussed with $(\tilde Y, \tilde X_1, \tilde X_2)$ playing the roles of $(Y, X_1, X_2)$.
By this device the tests described above and detailed in Section 5 below may be extended
to a wide range of applications including, for example, the hypotheses of parallelism and
coincidence of regression lines discussed by Adichie (1984) and others.
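The reduction in the Remark rests on an explicit inverse for $A = [V : R]'$. The following short derivation is a sketch that uses only the stated conditions ($R'V = 0$, with $V$ and $R$ of full column rank); it verifies the inverse and recovers the displayed model.

```latex
% Claim: A^{-1} = [\, V(V'V)^{-1} : R(R'R)^{-1} \,].
\[
A A^{-1}
= \begin{pmatrix} V' \\ R' \end{pmatrix}
  \begin{pmatrix} V(V'V)^{-1} & R(R'R)^{-1} \end{pmatrix}
= \begin{pmatrix}
    V'V(V'V)^{-1} & V'R(R'R)^{-1} \\
    R'V(V'V)^{-1} & R'R(R'R)^{-1}
  \end{pmatrix}
= \begin{pmatrix} I_p & 0 \\ 0 & I_q \end{pmatrix},
\]
% since R'V = 0 and hence V'R = (R'V)' = 0. Consequently
\[
X\beta = X A^{-1} A \beta
       = XV(V'V)^{-1}\gamma_1 + XR(R'R)^{-1}\gamma_2 ,
\]
% and substituting \gamma_2 = R'\beta = r under the hypothesis gives
\[
Y - XR(R'R)^{-1} r = XV(V'V)^{-1}\gamma_1 + E .
\]
```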

To illustrate the tests proposed above we consider briefly an example taken from
Adichie (1984, Example 3) dealing with the combustion of tobacco. The log of the leaf
burn (in seconds) of 30 batches of tobacco is thought to depend upon the percent
composition of nitrogen, chlorine, and potassium. Adichie suggests testing the potassium effect
and describes an aligned rank version of the test. We are unable to reproduce some
details of his calculations; however, using his approach we get least squares estimates of
the nitrogen and chlorine effects of -.529 and -.290 with an intercept of 2.653. With
these preliminary estimates we obtain aligned (Wilcoxon) ranks

 7   17    2   18    6    1   11    3   30   13
25   16    4   29   26   27   21   23   19   12
28   10    8   15   24   20   22    5   14    9

which yield a test statistic of 13.59, highly significant relative to the 1% $\chi^2_1$ critical value
of 6.63.

The full set of regression rank scores $\hat a_i(t)$ for the restricted model excluding
potassium for this data is illustrated in Figure 2.1. There are 34 distinct regression quantile
solutions and therefore each $\hat a_{ni}(t)$ is a piecewise linear function with at most 34 distinct
segments. Recall that $\hat a_{ni}(t) = 1$ while the observed $y_i$ is above the $t$th regression quantile
plane, 0 while below, and takes some intermediate value when $y_i$ falls on the $t$th plane.
The plots are ordered according to their Wilcoxon rank score, which may be computed as

$$\hat b_i = -\int_0^1 (t - 1/2) \, d\hat a_i(t) = \int_0^1 \hat a_i(t) \, dt - 1/2.$$

While the Wilcoxon rank scores provide an
unambiguous ranking of the observations, since the regression rank score functions
typically cross in regression applications, in contrast to the location model, this ranking
depends upon the score function employed. The regression rank score plots give some
further visual evidence concerning the ranking of the sample observations. Note that if
$\hat a_{ni}(t) \ge \hat a_{nj}(t)$ for all $t$, then $\hat b_{ni} \ge \hat b_{nj}$ for any monotone score function $\varphi$. Numerical
calculations give Wilcoxon ranks

-0.27   0.06  -0.41   0.09  -0.32  -0.48  -0.17  -0.38   0.48  -0.06
 0.23   0.04  -0.37   0.42   0.28   0.37   0.19   0.41   0.15  -0.26
 0.38  -0.16  -0.23  -0.01   0.33   0.12   0.15  -0.42  -0.10  -0.06

and yield a test statistic of 13.17. In view of Theorem 5.1 the approximate p-value is
.0003. The two vectors of Wilcoxon ranks correspond closely. Observation 6 is smallest
in both rankings and observations 14 and 9 are largest in both. The simple correlation
between the two rankings is .978. Note that as a practical matter when $\bar\varphi = \int_0^1 \varphi(t) \, dt = 0$,
we may omit the $\hat X_2$ term in the computation of $S_n$ in (5.3) since $\hat b_n$ is orthogonal to $X_1$.
This is in contrast with the aligned rank situation where the use of $X_2 - \hat X_2$ is essential.
Corresponding calculations for the normal scores using

$$\hat b_i = -\int_0^1 \Phi^{-1}(t) \, d\hat a_i(t) = \sum_{j} \hat a_i'(t_j) \, [\phi(\Phi^{-1}(t_j)) - \phi(\Phi^{-1}(t_{j-1}))]$$

where $\phi$ denotes the standard normal density, and $t_j$ is the $j$th regression quantile
breakpoint, yield

-0.74   0.15  -1.41   0.23  -0.91  -2.13  -0.45  -1.17   2.08  -0.15
 0.63   0.10  -1.25   1.44   0.78   1.15   0.50   1.35   0.40  -0.72
 1.41  -0.40  -0.61  -0.03   0.94   0.30   0.39  -1.45  -0.26  -0.18

and a test statistic of 12.87. The corresponding normal score aligned rank statistic is
11.72.
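Because each rank score function is piecewise linear, both the Wilcoxon and the normal scores are finite sums over the breakpoints. The sketch below assumes that a path is supplied as breakpoints `ts` (starting at 0 and ending at 1) together with the values `a_vals` of the rank score at those breakpoints; this representation and the function names are hypothetical. The example path is the location-model rank score (1.6) for an observation of rank 3 in a sample of size 4, not the tobacco data.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def density_at_quantile(t):
    """phi(Phi^{-1}(t)), using the limiting value 0 at t = 0 and t = 1."""
    if t <= 0.0 or t >= 1.0:
        return 0.0
    return STD_NORMAL.pdf(STD_NORMAL.inv_cdf(t))


def wilcoxon_score(ts, a_vals):
    """Integral of a(t) over [0, 1] minus 1/2; the trapezoid rule is exact here."""
    area = sum(0.5 * (a0 + a1) * (t1 - t0)
               for t0, t1, a0, a1 in zip(ts, ts[1:], a_vals, a_vals[1:]))
    return area - 0.5


def normal_score(ts, a_vals):
    """Sum over segments of slope * [phi(Phi^{-1}(t_j)) - phi(Phi^{-1}(t_{j-1}))]."""
    total = 0.0
    for t0, t1, a0, a1 in zip(ts, ts[1:], a_vals, a_vals[1:]):
        slope = (a1 - a0) / (t1 - t0)
        total += slope * (density_at_quantile(t1) - density_at_quantile(t0))
    return total


# Hypothetical path: rank 3 of 4 in the location model.
ts = [0.0, 0.25, 0.5, 0.75, 1.0]
a_vals = [1.0, 1.0, 1.0, 0.0, 0.0]
w = wilcoxon_score(ts, a_vals)
v = normal_score(ts, a_vals)
```

Segments on which the rank score is constant contribute nothing to the normal score, so only the breakpoints at which the observation lies on the fitted plane matter.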

Finally, the regression rank score version of the sign test yields the scores

-1.00   1.00  -1.00   1.00  -1.00  -1.00  -1.00  -1.00   1.00  -1.00
 1.00   1.00  -1.00   1.00   1.00   1.00   1.00   1.00   0.16  -1.00
 1.00  -1.00  -1.00  -0.37   1.00   1.00   1.00  -1.00  -1.00  -0.79

and a test statistic of 8.42, while the aligned rank sign scores yield 10.20. Note that we
have multiplied the sign scores by 2 to conform to conventional usage. Obviously, all
versions of the tests lead to a decisive rejection of the null. Note that for the sign scores
the test coincides with the $\ell_1$ Lagrange multiplier test discussed in Koenker and
Bassett (1982).
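Since a single coefficient is tested in this example ($q = 1$), the reference distribution is chi-square with one degree of freedom, whose upper tail has the closed form $P(\chi^2_1 > x) = \mathrm{erfc}(\sqrt{x/2})$. The sketch below applies this identity to the three regression rank score statistics reported in the text; the function and variable names are illustrative.

```python
import math


def chi2_sf_1df(x):
    """Upper tail of the chi-square distribution with 1 df: erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(x / 2.0))


# Test statistics as reported in the text for the regression rank score tests.
reported = {"Wilcoxon": 13.17, "normal": 12.87, "sign": 8.42}
p_values = {name: chi2_sf_1df(t) for name, t in reported.items()}

# Tail probability at the quoted 1% critical value.
tail_at_critical = chi2_sf_1df(6.63)
```

All three p-values fall below 0.01, in line with the decisive rejection described above.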

Since an important objective of the proposed rank tests is robustness to outlying
observations, it is interesting to observe the effect of perturbing one of the $y$ observations
of the Adichie data set on the aligned and rank scores versions of the test statistic. This
sensitivity analysis is illustrated in Figure 2.2. Even a modest perturbation in $y_1$ is
enough to confound the initial least squares estimate and reverse the conclusion of the
aligned rank test. Adding 10 to the first response, for example, alters the aligned
Wilcoxon test statistic from 13.58 to 5.7, which is no longer significant at 1%, and the vector
of ranks based on the perturbed data has a correlation of only .48 with the aligned ranks
based on the original data. The same perturbation of $y_1$ changes the Wilcoxon regression
rank score test statistic from 13.17 to 14.70 with a correlation between the two rank
vectors of .87. A more robust initial estimator would improve the performance of the
aligned rank test somewhat. The regression rank score version of the test is seen to be
relatively insensitive to such perturbations. One should be aware that comparable
perturbations in the $X_2$ design observations may wreak havoc even with the rank score form of
the test. Recent work of Antoch and Jureckova (1985) and deJongh, deWet, and Welsh
(1988) contains suggestions on robustifying regression quantiles, and therefore the
corresponding regression rank scores, to the effect of influential design points.

Computation of the tests was carried out in S+ using the algorithm described in
Koenker and d'Orey (1987, 1990) to compute regression quantiles.


3.      Properties  of  regression  rank  scores 

Consider the linear regression model (2.1) with design $X_n$ of dimension $n \times p$. Let
$\hat\beta(\alpha) \in R^p$ be the $\alpha$-regression quantile and $\hat a(\alpha) \in R^n$ be the vector of $\alpha$th regression
rank scores defined in (2.7). We see from the form of the linear constraints in (1.5) that
the regression rank scores are regression invariant, i.e.,

$$\hat a_n(\alpha, Y + Xb) = \hat a_n(\alpha, Y), \quad b \in R^p. \qquad (3.1)$$

Moreover, in view of the invariance, we may assume

$$\sum_{i=1}^{n} x_{ij} = 0, \quad j = 2, \ldots, p \qquad (3.2)$$

without loss of generality.

Our primary interest in this section will be the properties of the regression rank
scores process

$$\{\hat a_n(t): 0 \le t \le 1\}. \qquad (3.4)$$

Gutenbrunner and Jureckova (1992) studied the process

$$W_n^d = \{W_n^d(t) = n^{-1/2} \sum_{i=1}^{n} d_{ni} \hat a_{ni}(t): 0 \le t \le 1\} \qquad (3.5)$$

and showed that $W_n^d(t) = U_n^d(t) + o_p(1)$ where

$$U_n^d(t) = n^{-1/2} \sum_{i=1}^{n} d_{ni} I[E_i > F^{-1}(t)] \qquad (3.6)$$

as $n \to \infty$ uniformly on any fixed interval $[\varepsilon, 1-\varepsilon]$, where $0 < \varepsilon < 1/2$, for any
appropriately standardized triangular array $\{d_{ni}: i = 1, \ldots, n\}$ of vectors from $R^q$. They also
showed that the process (3.4) (and hence (3.5)) has continuous trajectories and, under the
standardization $\sum_{i=1}^{n} d_{ni} = 0$, (3.5) is tied-down to 0 at $t = 0$ and $t = 1$. The same authors
also established the weak convergence of (3.5) to the Brownian bridge over $[\varepsilon, 1-\varepsilon]$.
Note however that Theorem V.3.5 in Hajek and Sidak (1967) establishes the weak
convergence of (3.5) to the Brownian bridge over the entire interval [0, 1] in the special case
of the location submodel. Here we extend the results of Gutenbrunner and Jureckova
(1992) into the tails of [0,1], in order to find the asymptotic behavior of the rank scores
and the test statistics (2.7) and (2.8), for which the score functions are not constant in the
tails.
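The approximating process (3.6) is simple to evaluate, and the tie-down at the endpoints under $\sum d_{ni} = 0$ can be seen directly. The sketch below assumes scalar weights, standard normal errors, and simulated data; all names and the choice of distribution are illustrative assumptions.

```python
import math
import random
from statistics import NormalDist


def u_process(d, errors, t, quantile):
    """U_n^d(t) of (3.6): n^{-1/2} * sum of d_i over observations with E_i > F^{-1}(t)."""
    n = len(d)
    cut = quantile(t)
    return sum(di for di, e in zip(d, errors) if e > cut) / math.sqrt(n)


random.seed(0)
n = 200
errors = [random.gauss(0.0, 1.0) for _ in range(n)]

# Centre the weights so that they satisfy the standardization sum d_i = 0.
raw = [random.uniform(-1.0, 1.0) for _ in range(n)]
raw_mean = sum(raw) / n
d = [r - raw_mean for r in raw]

inverse_cdf = NormalDist().inv_cdf          # F^{-1} for standard normal errors
near_zero = u_process(d, errors, 1e-12, inverse_cdf)     # every E_i exceeds the cut
near_one = u_process(d, errors, 1 - 1e-12, inverse_cdf)  # no E_i exceeds the cut
middle = u_process(d, errors, 0.5, inverse_cdf)
```

Near $t = 0$ every indicator equals one, so the sum collapses to $n^{-1/2} \sum d_{ni} = 0$; near $t = 1$ every indicator is zero.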

It may be noted that this extension is rather delicate. If the rank scores involved
integration from $\varepsilon$ to $1-\varepsilon$ (i.e., if $\varphi$ were constant near 0 and 1), then the earlier
Gutenbrunner-Jureckova (1992) representation theorem could be used to obtain the
asymptotic distribution theory here under somewhat weaker hypotheses (see the remark
following Theorem 5.1). It is the desirability of treating such tests as the Wilcoxon and
Normal Scores Tests that requires the extensions here. Nonetheless, the fact shown here
that the rank score process can be represented uniformly on an interval $(\alpha_n^*, 1-\alpha_n^*)$ with
$\alpha_n^*$ decreasing as a negative power of $n$ (precisely, $\alpha_n^* = n^{-1/(1+4b)}$ for some $b > 0$) is
rather remarkable and of independent theoretical interest.

To this end, we will assume that the errors $E_1, \ldots, E_n$ in (2.1) are independent and
identically distributed according to the distribution function $F(x)$ which has an
absolutely continuous density $f$. We will assume that $f$ is positive for $A < x < B$ and
decreases monotonically as $x \to A+$ and $x \to B-$ where

$$-\infty \le A = \sup\{x: F(x) = 0\} \quad \text{and} \quad +\infty \ge B = \inf\{x: F(x) = 1\}.$$

For $0 < \alpha < 1$, let $\psi_\alpha$ denote the score function corresponding to (1.2):

$$\psi_\alpha(x) = \alpha - I[x < 0], \quad x \in R^1. \qquad (3.7)$$

We shall impose the following conditions on $F$:

(F.1) $|F^{-1}(\alpha)| \le c(\alpha(1-\alpha))^{-a}$ for $0 < \alpha \le \alpha_0$, $1-\alpha_0 \le \alpha < 1$, where $0 < a \le 1/4 - \varepsilon$,
$\varepsilon > 0$ and $c > 0$.

(F.2) $1/f(F^{-1}(\alpha)) \le c(\alpha(1-\alpha))^{-1-a}$ for $0 < \alpha \le \alpha_0$ and $1-\alpha_0 \le \alpha < 1$, $c > 0$.

(F.3) $f(x) > 0$ is absolutely continuous, bounded and monotonically decreasing as
$x \to A+$ and $x \to B-$. The derivative $f'$ is bounded a.e.

(F.4) $\left| \dfrac{f'(x)}{f(x)} \right| \le c|x|$ for $|x| \ge K \ge 0$, $c > 0$.

REMARK. These conditions are satisfied, for example, by the normal, logistic, double
exponential and $t$ distributions with 5 or more degrees of freedom. Condition (F.1)
implies $\int |t|^{4+\delta} \, dF(t) < +\infty$ for some $\delta > 0$. Hence, using (F.4) also, $F$ has finite Fisher
Information, a fact to be applied in Theorem 5.1.

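The following design assumptions will also be employed.

(X.1) $x_{i1} = 1$, $i = 1, \ldots, n$.

(X.2) $\lim_{n \to \infty} D_n = D$ where $D_n = n^{-1} X_n' X_n$ and $D$ is a positive definite $p \times p$ matrix.

(X.3) $n^{-1} \sum_{i=1}^{n} \|x_i\|^4 = O(1)$ as $n \to \infty$.

(X.4) $\max_{1 \le i \le n} \|x_i\| = O(n^{(2(b-a)-\delta)/(1+4b)})$ for some $b > 0$ and $\delta > 0$ such that $0 < b-a < \varepsilon/2$
(hence $0 < b < 1/4 - \varepsilon/2$).

We may now define

$$\alpha_n^* = n^{-1/(1+4b)} \quad \text{and} \quad \sigma_\alpha = \frac{(\alpha(1-\alpha))^{1/2}}{f(F^{-1}(\alpha))}, \quad 0 < \alpha < 1. \qquad (3.8)$$

Let $C$ be a fixed constant and define

$$C_n = C (\log_2 n)^{1/2}. \qquad (3.9)$$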
We now prove the following crucial lemma:

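LEMMA 3.1 Assume that $F$ satisfies (F.1) - (F.4) and that $X_n$ satisfies (X.1) - (X.3).
Then, as $n \to \infty$,

$$\sup\{|r_n(t, \alpha)| : \|t\| \le C_n, \ \alpha_n^* \le \alpha \le 1-\alpha_n^*\} \xrightarrow{p} 0 \qquad (3.10)$$

for $C_n$ given by (3.9), where

$$r_n(t, \alpha) = (\alpha(1-\alpha))^{-1/2} \sigma_\alpha^{-1} \sum_{i=1}^{n} [\rho_\alpha(E_{i\alpha} - n^{-1/2} \sigma_\alpha x_i' t) - \rho_\alpha(E_{i\alpha})]
+ n^{-1/2} (\alpha(1-\alpha))^{-1/2} \sum_{i=1}^{n} x_i' t \, \psi_\alpha(E_{i\alpha}) - \tfrac{1}{2} t' D_n t \qquad (3.11)$$

and

$$E_{i\alpha} = E_i - F^{-1}(\alpha), \quad i = 1, \ldots, n. \qquad (3.12)$$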

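PROOF.

(i) First fix $\alpha \in [\alpha_n^*, 1-\alpha_n^*]$ and $t$ such that $\|t\| \le C_n$.
Define, for some $0 < \gamma < b$,

$$B_n = \max\{ n^{-2a/(1+4b)}, \ n^{-(2-\gamma)(b-a)/(1+4b)}, \ n^{-(b-\gamma)/(1+4b)} \}. \qquad (3.13)$$

We wish to show that for any $\lambda > 0$

$$P(|r_n(t, \alpha)| \ge (\lambda+1) B_n) \le K n^{-\lambda} \qquad (3.14)$$

with a fixed $K > 0$. To do this, we will use the Markov inequality

$$P(|r_n(t, \alpha)| \ge s_n) \le \exp(-u s_n)(M(u) + M(-u)), \quad u > 0 \qquad (3.15)$$

where $M(u) = E \exp(u \, r_n(t, \alpha))$.
Denote

$$e_{ni} = e_{ni}(t, \alpha) = n^{-1/2} \sigma_\alpha x_i' t \qquad (3.16)$$

and

$$R_i(t, \alpha) = (\alpha(1-\alpha))^{-1/2} \sigma_\alpha^{-1} [\rho_\alpha(E_{i\alpha} - n^{-1/2} \sigma_\alpha x_i' t) - \rho_\alpha(E_{i\alpha})]
+ n^{-1/2} (\alpha(1-\alpha))^{-1/2} x_i' t \, \psi_\alpha(E_{i\alpha}) - \tfrac{1}{2} n^{-1} (x_i' t)^2, \quad i = 1, \ldots, n. \qquad (3.17)$$

By definition of $E_{i\alpha}$, $\sigma_\alpha$, $\rho_\alpha$ and $\psi_\alpha$,

$$R_i(t, \alpha) + \tfrac{1}{2} n^{-1} (x_i' t)^2 = (\alpha(1-\alpha))^{-1/2} \sigma_\alpha^{-1} \{ (E_{i\alpha} - e_{ni}) I[e_{ni} < E_{i\alpha} < 0]
+ (e_{ni} - E_{i\alpha}) I[0 < E_{i\alpha} < e_{ni}] \} \qquad (3.18)$$

and hence, uniformly for $\alpha_n^* \le \alpha \le 1-\alpha_n^*$, $\|t\| \le C_n$ and $i = 1, \ldots, n$,

$$|R_i(t, \alpha) + \tfrac{1}{2} n^{-1} (x_i' t)^2| \le 2 n^{-1/2} (\alpha(1-\alpha))^{-1/2} |x_i' t| = O(n^{-(2a+\delta)/(1+4b)} (\log_2 n)^{1/2}). \qquad (3.19)$$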
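The algebraic identity behind (3.18) can be checked numerically. Dropping the common factor $(\alpha(1-\alpha))^{-1/2} \sigma_\alpha^{-1}$, it states that $\rho_\alpha(E - e) - \rho_\alpha(E) + e \, \psi_\alpha(E)$ equals $(E - e) I[e < E < 0] + (e - E) I[0 < E < e]$. The sketch below compares both sides on random inputs; the variable names and sampling ranges are illustrative assumptions.

```python
import random


def rho(u, alpha):
    """Check function (1.3), written as u * (alpha - I[u < 0])."""
    return u * (alpha - (1.0 if u < 0 else 0.0))


def psi(x, alpha):
    """Score function (3.7): alpha - I[x < 0]."""
    return alpha - (1.0 if x < 0 else 0.0)


def left_side(E, e, alpha):
    """rho(E - e) - rho(E) + e * psi(E), the unscaled left side of (3.18)."""
    return rho(E - e, alpha) - rho(E, alpha) + e * psi(E, alpha)


def right_side(E, e):
    """(E - e) I[e < E < 0] + (e - E) I[0 < E < e], the unscaled right side."""
    if e < E < 0:
        return E - e
    if 0 < E < e:
        return e - E
    return 0.0


random.seed(1)
worst_gap = 0.0
worst_ratio = 0.0
for _ in range(20_000):
    E = random.uniform(-2.0, 2.0)
    e = random.uniform(-2.0, 2.0)
    alpha = random.uniform(0.01, 0.99)
    worst_gap = max(worst_gap, abs(left_side(E, e, alpha) - right_side(E, e)))
    worst_ratio = max(worst_ratio, abs(right_side(E, e)) / abs(e))
```

The right side never exceeds $|e|$ in absolute value, which is the elementary fact used to pass from (3.18) to the bound (3.19).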

If $uR_i$ is bounded, that is, $0 < u \le n^{(2a+\delta)/(1+4b)} (\log_2 n)^{-1/2}$, Taylor series expansion
yields

$$\log M_{R_i}(u) \le u \, E R_i(t, \alpha) + c u^2 \, \mathrm{Var}(R_i(t, \alpha)) \qquad (3.20)$$

for some constant $c > 0$. By (3.18), for $e_{ni} > 0$ and for $\alpha_n^* \le \alpha \le \alpha_0$, $1-\alpha_n^* \ge \alpha \ge 1-\alpha_0$,

$$E R_i(t, \alpha) = -\tfrac{1}{2} n^{-1} (x_i' t)^2 + (\alpha(1-\alpha))^{-1/2} \sigma_\alpha^{-1} \int_0^{e_{ni}} (e_{ni} - z) f(z + F^{-1}(\alpha)) \, dz.$$

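Now,

$$\log f(z + F^{-1}(\alpha)) = \log f(F^{-1}(\alpha)) + \int_0^z \frac{d}{du} \log f(u + F^{-1}(\alpha)) \, du;$$

or, by condition F.4,

$$f(z + F^{-1}(\alpha)) \le f(F^{-1}(\alpha)) \exp\Big\{ \int_0^z (u + |F^{-1}(\alpha)|) \, du \Big\}.$$

Also by (3.8), (3.16) and the conditions,

$$e_{ni} |F^{-1}(\alpha)| = O\big( n^{-1/2} (\alpha_n^*)^{-1/2-2a} \, n^{(2(b-a)-\delta)/(1+4b)} (\log_2 n)^{1/2} \big) \to 0.$$

Hence,

$$\exp\Big\{ \int_0^z (u + |F^{-1}(\alpha)|) \, du \Big\} \le 1 + c \int_0^z (|F^{-1}(\alpha)| + u) \, du.$$

Therefore,

$$E R_i(t, \alpha) = -\tfrac{1}{2} n^{-1} (x_i' t)^2 + (\alpha(1-\alpha))^{-1/2} \sigma_\alpha^{-1} f(F^{-1}(\alpha)) \Big\{ \int_0^{e_{ni}} (e_{ni} - z) \, dz
+ O(1) \int_0^{e_{ni}} (e_{ni} - z) \int_0^z (|F^{-1}(\alpha)| + u) \, du \, dz \Big\}. \qquad (3.21)$$

By (3.8) and (3.16), the first integral in (3.21) exactly cancels $-\tfrac{1}{2} n^{-1} (x_i' t)^2$; and,
therefore, using conditions F.1 - F.4,

$$E R_i(t, \alpha) \le c (\alpha(1-\alpha))^{-1/2-2a} n^{-3/2} |x_i' t|^3 + c (\alpha(1-\alpha))^{-1-2a} n^{-2} |x_i' t|^4. \qquad (3.21)$$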

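We get the same inequality for $e_{ni} < 0$. The same expressions are $O(n^{-3/2} |x_i' t|^3)$
$+ \, O(n^{-2} |x_i' t|^4)$ if $\alpha_0 < \alpha < 1-\alpha_0$. Hence,

$$\sum_{i=1}^{n} |E R_i(t, \alpha)| = O\big( n^{-2(b-a)/(1+4b)} \big). \qquad (3.22)$$

Similarly, using (3.18) and (3.21),

$$\mathrm{Var} \, R_i(t, \alpha) \le \Big[ \frac{f(F^{-1}(\alpha))}{\alpha(1-\alpha)} \Big]^2 \Big\{ f(F^{-1}(\alpha)) \int_0^{|e_{ni}|} (|e_{ni}| - z)^2 \Big[ 1 + \int_0^z (|F^{-1}(\alpha)| + y) \, dy \Big] dz
+ f^2(F^{-1}(\alpha)) \Big\{ \int_0^{|e_{ni}|} (|e_{ni}| - z) \Big[ 1 + \int_0^z (|F^{-1}(\alpha)| + y) \, dy \Big] dz \Big\}^2 \Big\}.$$

Therefore, using (3.8),

$$\sum_{i=1}^{n} \mathrm{Var} \, R_i(t, \alpha) \le c \, n^{-3/2} (\alpha(1-\alpha))^{-1/2} \sum_{i=1}^{n} |x_i' t|^3 = O\big( n^{-2b/(1+4b)} \big). \qquad (3.23)$$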

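These results hold uniformly in $\alpha$ and $\mathbf{t}$.

Hence, using (3.15) and (3.13) with $u = \log n / B_n = O(n^{2a/(1+4b)})$, so that (3.20) holds,

$$P\big(|r_n(\mathbf{t},\alpha)| \ge (\lambda+1)B_n\big) \le \exp\Big\{-(\lambda+1)\log n + \frac{\log n}{B_n}\, n^{-\frac{2(b-a)}{1+4b}} + \frac{K\log^2 n}{B_n^2}\, n^{-\frac{2b}{1+4b}}\Big\} \le n^{-\lambda} \tag{3.24}$$

for $n \ge n_0$, where $K > 0$ and $n_0$ do not depend on $\alpha$ and $\mathbf{t}$.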


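(ii) Now apply the chaining argument to extend (3.24) uniformly in $(\mathbf{t}, \alpha)$. Following the proof of Lemma A.2 in Koenker and Portnoy (1987), choose intervals of length $1/n^5$ covering $[\alpha_n^*, 1-\alpha_n^*]$ and balls of radius $1/n^5$ covering $\{\mathbf{t}: \|\mathbf{t}\| \le C_n\}$. Let $\{\alpha_1, \alpha_2\}$ lie in one of the intervals and $(\mathbf{t}_1, \mathbf{t}_2)$ lie in one of the balls covering $\{\mathbf{t}: \|\mathbf{t}\| \le C_n\}$. We now use (3.18) to bound $\Delta_i = |R_i(\mathbf{t}_1, \alpha_1) - R_i(\mathbf{t}_2, \alpha_2)|$. So define intervals $J_l^+$ and $J_l^-$ as follows for $l = 1, 2$:

$$J_l^+ = \big[F^{-1}(\alpha_l),\; e_{ni}(\mathbf{t}_l,\alpha_l) + F^{-1}(\alpha_l)\big], \qquad J_l^- = \big[e_{ni}(\mathbf{t}_l,\alpha_l) + F^{-1}(\alpha_l),\; F^{-1}(\alpha_l)\big].$$

Also define (for $l = 1, 2$):

$$G_l(E_i) = \frac{f(F^{-1}(\alpha_l))}{\alpha_l(1-\alpha_l)}\,\big(E_i - F^{-1}(\alpha_l) - e_{ni}(\mathbf{t}_l,\alpha_l)\big).$$

Then, from (3.18),

$$\Delta_i = -\frac{1}{2n}(\mathbf{x}_i'\mathbf{t}_1)^2 + \frac{1}{2n}(\mathbf{x}_i'\mathbf{t}_2)^2 + H(E_i) \tag{3.25}$$

where

$$H(E_i) = \begin{cases} G_1(E_i) - G_2(E_i), & E_i \in J_1^+ \cap J_2^+,\\ G_2(E_i) - G_1(E_i), & E_i \in J_1^- \cap J_2^-,\\ \Delta_i^*, & E_i \in (J_1^+ \cap J_2^-) \cup (J_1^- \cap J_2^+),\\ 0, & \text{otherwise,} \end{cases}$$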

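and

$$\Delta_i^* \le \max_{l=1,2} \frac{f(F^{-1}(\alpha_l))}{\alpha_l(1-\alpha_l)}\,|e_{ni}(\mathbf{t}_l,\alpha_l)| = \max_{l=1,2} \frac{n^{-1/2}|\mathbf{x}_i'\mathbf{t}_l|}{(\alpha_l(1-\alpha_l))^{1/2}} \le C_n\, n^{-1/2}\, n^{\frac{2(b-a)-\delta}{1+4b}}\, n^{\frac{1}{2(1+4b)}} = C_n\, n^{-\frac{2a+\delta}{1+4b}} \to 0.$$

Now note that for $\|\mathbf{t}\| \le C_n$ and $\alpha_n^* \le \alpha_l \le 1-\alpha_n^*$,

$$|e_{ni}(\mathbf{t}_1,\alpha_1) - e_{ni}(\mathbf{t}_2,\alpha_2)| \le C_n n^{-1/2}\|\mathbf{x}_i\|\,\frac{|(\alpha_1(1-\alpha_1))^{1/2} - (\alpha_2(1-\alpha_2))^{1/2}|}{\min_\alpha f(F^{-1}(\alpha))} + C_n n^{-1/2}\|\mathbf{x}_i\|\,\Big|\frac{1}{f(F^{-1}(\alpha_1))} - \frac{1}{f(F^{-1}(\alpha_2))}\Big| + c\,n^{-1/2}\|\mathbf{x}_i\|\,\frac{\|\mathbf{t}_1 - \mathbf{t}_2\|}{\min_\alpha f(F^{-1}(\alpha))}.$$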
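By conditions F.1 and F.2, for $\alpha \in [\alpha_n^*, 1-\alpha_n^*]$, $1/(\alpha(1-\alpha)) \le n$,

$$\frac{1}{\min_\alpha f(F^{-1}(\alpha))} \le \frac{c}{(\alpha(1-\alpha))^{1+a}} \le n^{5/4}$$

and

$$|F^{-1}(\alpha_1) - F^{-1}(\alpha_2)| \le \frac{|\alpha_1 - \alpha_2|}{\min_\alpha f(F^{-1}(\alpha))} \le n^{-3.75}.$$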


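Therefore, using X.4 and the fact that $|\alpha_1 - \alpha_2| \le n^{-5}$ and $\|\mathbf{t}_1 - \mathbf{t}_2\| \le n^{-5}$, it is straightforward to show that the contributions to (3.25) excluding $\Delta_i^*$ are all $o(1)$. Since $H(E_i) = \Delta_i^*$ in (3.25) only if $E_i$ lies between $F^{-1}(\alpha_1)$ and $F^{-1}(\alpha_2)$ (otherwise, the intersections of intervals defining $\Delta_i^*$ must be empty),

$$\sup_{S_\nu}\Big|\sum_{i=1}^n R_i(\mathbf{t}_1,\alpha_1) - \sum_{i=1}^n R_i(\mathbf{t}_2,\alpha_2)\Big| \le o(1) + K^* o(1)$$

where $S_\nu$ denotes the covering set containing $(\alpha_1, \mathbf{t}_1)$ and $(\alpha_2, \mathbf{t}_2)$, and $K^*$ is the number of times $E_i$ lies between $F^{-1}(\alpha_1)$ and $F^{-1}(\alpha_2)$. Now, $K^* \sim \text{binomial}(n, p)$ where $p$ is the probability that $E_i$ lies between $F^{-1}(\alpha_1)$ and $F^{-1}(\alpha_2)$ with $|\alpha_1 - \alpha_2| \le n^{-5}$. Thus, since $f$ is bounded, $p \le c^* n^{-3.75}$. Therefore,

$$P\Big\{\sup_{S_\nu} |r_n(\mathbf{t}_1,\alpha_1) - r_n(\mathbf{t}_2,\alpha_2)| \ge o(1) + \lambda\, o(1)\Big\} \le \sum_{k=\lambda}^{n} \binom{n}{k} p^k (1-p)^{n-k} \le c\, n^{-\lambda}.$$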


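Since the number of sets needed to cover the set $S = [\alpha_n^*, 1-\alpha_n^*] \times \{\mathbf{t}: \|\mathbf{t}\| \le C_n\}$ is bounded by $n^{5(p+1)}$, we obtain from (3.24) for $\lambda > 5(p+1)$

$$P\Big\{\sup_{(\alpha,\mathbf{t}) \in S} |r_n(\mathbf{t},\alpha)| \ge (\lambda+1)B_n + o(1) + \lambda\, o(1)\Big\} \le n^{5(p+1)}\, n^{-\lambda} \to 0. \qquad \Box$$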

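LEMMA 3.2. Assume the conditions of Lemma 3.1 and let $\mathbf{d}_n = (d_{n1}, \ldots, d_{nn})'$ be a sequence of vectors satisfying

$$\mathbf{X}_n'\mathbf{d}_n = 0, \qquad \frac{1}{n}\sum_{i=1}^n d_{ni}^2 \to \Delta^2, \quad 0 < \Delta^2 < \infty \tag{D.1}$$

$$n^{-1}\sum_{i=1}^n |d_{ni}|^3 = O(1) \quad \text{as } n \to \infty \tag{D.2}$$

$$\max_{1 \le i \le n} |d_{ni}| = O\big(n^{(2(b-a)-\delta)/(1+4b)}\big). \tag{D.3}$$

Then, with $S^* = \{(\mathbf{t}, \alpha): \|\mathbf{t}\| \le C,\ \alpha_n^* \le \alpha \le 1-\alpha_n^*\}$,

$$\sup_{(\alpha,\mathbf{t}) \in S^*}\Big\{(\alpha(1-\alpha))^{-1/2}\, n^{-1/2}\,\Big|\sum_{i=1}^n d_{ni}\big[\psi_\alpha(E_{i\alpha} - n^{-1/2}\sigma_\alpha \mathbf{x}_i'\mathbf{t}) - \psi_\alpha(E_{i\alpha})\big]\Big|\Big\} \to 0 \tag{3.26}$$

as $n \to \infty$ for any fixed $C > 0$ and for $\alpha_n^*$ given in (3.8).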


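PROOF. Consider the model

$$\mathbf{Y} = \mathbf{X}^*\beta^* + \mathbf{E}$$

where $\mathbf{X}^* = (\mathbf{X}_n \mid \mathbf{d}_n)$, $\beta^* = (\beta_1, \ldots, \beta_p, \beta_{p+1}, \ldots, \beta_{p+q})$. Then

$$\mathbf{X}^{*\prime}\mathbf{X}^* = \begin{pmatrix} \mathbf{X}_n'\mathbf{X}_n & 0 \\ 0 & \mathbf{d}_n'\mathbf{d}_n \end{pmatrix}$$

and the conditions of Lemma 3.1 are satisfied even when replacing $\mathbf{X}$ by $\mathbf{X}^*$ and taking $\mathbf{t} \in R^{p+q}$. Now, the quantity in brackets in (3.26) is just the right derivative of (3.11) with respect to the last $q$ coordinates of $\mathbf{t}$ (evaluated when the last $q$ coordinates of $\mathbf{t}$ are zero). To obtain the desired uniform convergence, let $f_n(\mathbf{t}, \alpha)$ denote the right hand side of (3.11) without the last term, $\tfrac12\mathbf{t}'\mathbf{D}_n\mathbf{t}$, and let $g(\mathbf{t}) = \tfrac12\mathbf{t}'\mathbf{D}\mathbf{t}$. Note that $\tfrac12\mathbf{t}'\mathbf{D}_n\mathbf{t}$ can be replaced by $g(\mathbf{t})$ since $\mathbf{D}_n \to \mathbf{D}$ (and $\|\mathbf{t}\|$ is bounded on $S^*$). By Lemma 3.1, choose $\delta_n$ so that

$$\sup_{(\alpha,\mathbf{t}) \in S^*} |f_n(\mathbf{t}, \alpha) - g(\mathbf{t})| \le \delta_n^2.$$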

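Following Rockafellar (1970, Thm. 25.7, p. 248), the convexity of $f_n$ makes the difference quotients monotonic. That is, with $\mathbf{u}$ a properly chosen coordinate vector,

$$\frac{\partial}{\partial t_j} f_n(\mathbf{t},\alpha) \le \frac{1}{\delta_n}\big(f_n(\mathbf{t} + \delta_n\mathbf{u}, \alpha) - f_n(\mathbf{t},\alpha)\big) \le \frac{1}{\delta_n}\big(g(\mathbf{t} + \delta_n\mathbf{u}) - g(\mathbf{t})\big) + \frac{1}{\delta_n}\,O(\delta_n^2).$$

Replacing $\mathbf{u}$ by $-\mathbf{u}$, the reverse inequality follows similarly (with minus signs on the right side). Therefore,

$$\Big|\frac{\partial}{\partial t_j}\big(f_n(\mathbf{t},\alpha) - g(\mathbf{t})\big)\Big| \le \Big|\frac{g(\mathbf{t} + \delta_n\mathbf{u}) - g(\mathbf{t})}{\delta_n} - \frac{\partial}{\partial t_j} g(\mathbf{t})\Big| + O(\delta_n).$$

Since $g$ is a quadratic function, this last term tends to zero as a constant times $\delta_n$ (uniformly on $S^*$). This gives (3.26), since the contribution of the final term of (3.11) vanishes when differentiating with respect to the last $q$ coordinates and setting these coordinates to zero. $\Box$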
Let $\hat\beta_n(\alpha)$ be the $\alpha$-regression quantile corresponding to the reduced model (2.1) with the design matrix of order $(n \times p)$; i.e., $\hat\beta_n(\alpha)$ is a solution of the minimization

$$\sum_{i=1}^n \rho_\alpha(Y_i - \mathbf{x}_i'\mathbf{t}) := \min, \qquad \mathbf{t} \in R^p. \tag{3.27}$$

Analogously, define $\beta(\alpha) = (F^{-1}(\alpha), 0, 0, \ldots, 0)$; that is, the solution to (3.27) when the summation is replaced by expectation. The following theorem establishes the rate of consistency of regression quantiles, and is needed for the representation of the dual process.
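To make the minimization (3.27) concrete, here is a minimal sketch for a straight-line model ($p = 2$). It is not the algorithm used in the paper or in practice, where (3.27) is solved by linear programming. It assumes the usual check function $\rho_\alpha(u) = u(\alpha - I[u < 0])$ and relies on the fact that some minimizer of this piecewise linear convex objective interpolates $p$ observations. The function names `rho`, `objective` and `regression_quantile` are hypothetical.

```python
from itertools import combinations


def rho(u, alpha):
    """Check function rho_alpha(u) = u * (alpha - I[u < 0])."""
    return u * (alpha - (1.0 if u < 0 else 0.0))


def objective(y, x, b0, b1, alpha):
    """Sum of rho_alpha over the residuals y_i - b0 - b1 * x_i, as in (3.27)."""
    return sum(rho(yi - b0 - b1 * xi, alpha) for yi, xi in zip(y, x))


def regression_quantile(y, x, alpha):
    """Brute-force alpha-regression quantile for the line y = b0 + b1 * x.

    Scans every line through two observations with distinct x and keeps
    the one with the smallest objective value.
    """
    best = None
    for i, j in combinations(range(len(y)), 2):
        if x[i] == x[j]:
            continue  # two points with equal x do not determine a line
        b1 = (y[j] - y[i]) / (x[j] - x[i])
        b0 = y[i] - b1 * x[i]
        value = objective(y, x, b0, b1, alpha)
        if best is None or value < best[0]:
            best = (value, b0, b1)
    if best is None:
        raise ValueError("need at least two observations with distinct x")
    return best[1], best[2]


# Five points on the line y = 1 + 2x plus one gross outlier at x = 3.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0 + 2.0 * v for v in x]
y[3] += 10.0
b0, b1 = regression_quantile(y, x, 0.5)
print(b0, b1)  # the median fit ignores the outlier: intercept 1, slope 2
```

The scan costs $O(n^3)$ operations and is only meant to illustrate the definition on small samples.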

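THEOREM 3.1. Under the conditions (F.1) - (F.4) and (X.1) - (X.4),

$$n^{1/2}\sigma_\alpha^{-1}\big(\hat\beta_n(\alpha) - \beta(\alpha)\big) = n^{-1/2}(\alpha(1-\alpha))^{-1/2}\,\mathbf{D}_n^{-1}\sum_{i=1}^n \mathbf{x}_i\,\psi_\alpha(E_{i\alpha}) + o_p(1) \tag{3.28}$$

uniformly in $\alpha_n^* \le \alpha \le 1-\alpha_n^*$. Consequently,

$$\sup_{\alpha_n^* \le \alpha \le 1-\alpha_n^*}\big\|n^{1/2}\sigma_\alpha^{-1}\big(\hat\beta_n(\alpha) - \beta(\alpha)\big)\big\| = O_p(1). \tag{3.29}$$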


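PROOF. If $\hat\beta_n(\alpha)$ minimizes (3.27), then

$$\mathbf{T}_{n\alpha} = n^{1/2}\sigma_\alpha^{-1}\big(\hat\beta_n(\alpha) - \beta(\alpha)\big) \tag{3.30}$$

minimizes the convex function

$$G_{n\alpha}(\mathbf{t}) = (\alpha(1-\alpha))^{-1/2}\sigma_\alpha^{-1}\sum_{i=1}^n \big[\rho_\alpha(E_{i\alpha} - n^{-1/2}\sigma_\alpha\mathbf{x}_i'\mathbf{t}) - \rho_\alpha(E_{i\alpha})\big] \tag{3.31}$$

with respect to $\mathbf{t} \in R^p$. By Lemma 3.1, for any fixed $C > 0$

$$\min_{\|\mathbf{t}\| \le C_n} G_{n\alpha}(\mathbf{t}) = \min_{\|\mathbf{t}\| \le C_n}\big\{-\mathbf{t}'\mathbf{Z}_{n\alpha} + \tfrac12\mathbf{t}'\mathbf{D}_n\mathbf{t}\big\} + o_p(1) \tag{3.32}$$

uniformly in $\alpha_n^* \le \alpha \le 1-\alpha_n^*$, where

$$\mathbf{Z}_{n\alpha} = n^{-1/2}(\alpha(1-\alpha))^{-1/2}\sum_{i=1}^n \mathbf{x}_i\,\psi_\alpha(E_{i\alpha}). \tag{3.33}$$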

y=i 

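It will be necessary to provide a probabilistic bound for $B = \sup\{\|\mathbf{Z}_{n\alpha}\| : \alpha_n^* \le \alpha \le 1-\alpha_n^*\}$. Writing

$$\mathbf{Z}_{n\alpha} = \frac{n^{-1/2}}{(\alpha(1-\alpha))^{1/2}}\sum_{i=1}^n \mathbf{x}_i\Big\{(1-\alpha)\big(I\{F(E_i) \le \alpha\} - \alpha\big) + \alpha\big(I\{F(E_i) \le 1-\alpha\} - (1-\alpha)\big)\Big\},$$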

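the invariance theorem of Shorack (1991) can be applied. Using conditions X.3, X.4, and the fact that $\alpha_n^* > n^{-1}$, equation (1.10) or (1.11) of Shorack (1991) implies that

$$B \le O_p(1) + c\,\sup\big\{(s(1-s))^{-1/2}\,\|W(s)\| : \alpha_n^* \le s \le 1-\alpha_n^*\big\}$$

for some constant $c$, where $W(s)$ is a Brownian Bridge. This last supremum is bounded by $(\log_2 n)^{1/2} + O_p(1)$ (see, for example, Shorack and Wellner (1986), p. 599). Thus $\mathbf{Z}_{n\alpha} = O_p((\log_2 n)^{1/2})$ uniformly on $\alpha_n^* \le \alpha \le 1-\alpha_n^*$. Therefore, denoting

$$\mathbf{U}_{n\alpha} = \arg\min\big\{-\mathbf{t}'\mathbf{Z}_{n\alpha} + \tfrac12\mathbf{t}'\mathbf{D}_n\mathbf{t}\big\}, \tag{3.34}$$

we immediately get

$$\mathbf{U}_{n\alpha} = \mathbf{D}_n^{-1}\mathbf{Z}_{n\alpha} = O_p\big((\log_2 n)^{1/2}\big) \tag{3.35}$$

uniformly in $\alpha_n^* \le \alpha \le 1-\alpha_n^*$ and

$$\min_{\mathbf{t} \in R^p}\big\{-\mathbf{t}'\mathbf{Z}_{n\alpha} + \tfrac12\mathbf{t}'\mathbf{D}_n\mathbf{t}\big\} = -\tfrac12\,\mathbf{Z}_{n\alpha}'\mathbf{D}_n^{-1}\mathbf{Z}_{n\alpha}. \tag{3.36}$$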

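From (3.35) and (3.36), we can write

$$-\mathbf{t}'\mathbf{Z}_{n\alpha} + \tfrac12\mathbf{t}'\mathbf{D}_n\mathbf{t} = \tfrac12\big\{(\mathbf{t}-\mathbf{U}_{n\alpha})'\mathbf{D}_n(\mathbf{t}-\mathbf{U}_{n\alpha}) - \mathbf{U}_{n\alpha}'\mathbf{D}_n\mathbf{U}_{n\alpha}\big\} \tag{3.37}$$

and hence we could rewrite (3.10) in the form

$$\sup_{(\alpha,\mathbf{t}) \in S} |r_n(\mathbf{t},\alpha)| = \sup_{(\alpha,\mathbf{t}) \in S}\Big|G_{n\alpha}(\mathbf{t}) - \tfrac12\big[(\mathbf{t}-\mathbf{U}_{n\alpha})'\mathbf{D}_n(\mathbf{t}-\mathbf{U}_{n\alpha}) - \mathbf{U}_{n\alpha}'\mathbf{D}_n\mathbf{U}_{n\alpha}\big]\Big| \to 0. \tag{3.38}$$

Inserting $\mathbf{U}_{n\alpha}$ $(= O_p((\log_2 n)^{1/2}))$ for $\mathbf{t}$, we further obtain

$$\sup_{\alpha_n^* \le \alpha \le 1-\alpha_n^*}\big\{\big|G_{n\alpha}(\mathbf{U}_{n\alpha}) + \tfrac12\mathbf{U}_{n\alpha}'\mathbf{D}_n\mathbf{U}_{n\alpha}\big|\big\} = o_p(1). \tag{3.39}$$

We would like to show that

$$\sup_{\alpha_n^* \le \alpha \le 1-\alpha_n^*}\big\{\|\mathbf{T}_{n\alpha} - \mathbf{U}_{n\alpha}\|\big\} = o_p(1). \tag{3.40}$$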

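Consider the ball $B_{n\alpha}$ with center $\mathbf{U}_{n\alpha}$ and radius $\delta > 0$. This ball lies in a compact set with probability exceeding $(1 - \epsilon)$ for $n \ge n_0$; actually, for $\mathbf{t} \in B_{n\alpha}$,

$$\|\mathbf{t}\| \le \|\mathbf{t} - \mathbf{U}_{n\alpha}\| + \|\mathbf{U}_{n\alpha}\| \le \delta + K_1(\log_2 n)^{1/2}$$

for some $K_1$ with probability exceeding $1 - \epsilon$ for $n \ge n_0$. Hence, by (3.10),

$$\Delta_{n\alpha} = \sup_{\alpha_n^* \le \alpha \le 1-\alpha_n^*}\ \sup_{\mathbf{t} \in B_{n\alpha}} |r_n(\mathbf{t}, \alpha)| \xrightarrow{P} 0. \tag{3.41}$$

Following Pollard (1991), consider the behavior of $G_{n\alpha}(\mathbf{t})$ outside $B_{n\alpha}$. Suppose $\mathbf{t}_\alpha = \mathbf{U}_{n\alpha} + k\xi$, $k > \delta$ and $\|\xi\| = 1$. Let $\mathbf{t}_\alpha^*$ be the boundary point of $B_{n\alpha}$ that lies on the line from $\mathbf{U}_{n\alpha}$ to $\mathbf{t}_\alpha$, i.e., $\mathbf{t}_\alpha^* = \mathbf{U}_{n\alpha} + \delta\xi$. Then $\mathbf{t}_\alpha^* = (1 - (\delta/k))\mathbf{U}_{n\alpha} + (\delta/k)\mathbf{t}_\alpha$ and hence, by (3.38) and (3.39),

$$\frac{\delta}{k}\,G_{n\alpha}(\mathbf{t}_\alpha) + \Big(1 - \frac{\delta}{k}\Big)G_{n\alpha}(\mathbf{U}_{n\alpha}) \ge G_{n\alpha}(\mathbf{t}_\alpha^*) \ge \tfrac12\delta^2\lambda_0 + G_{n\alpha}(\mathbf{U}_{n\alpha}) - 2\Delta_{n\alpha}$$

where $\lambda_0$ is the minimal eigenvalue of $\mathbf{D}$. Hence,

$$\inf_{\|\mathbf{t} - \mathbf{U}_{n\alpha}\| \ge \delta} G_{n\alpha}(\mathbf{t}) \ge G_{n\alpha}(\mathbf{U}_{n\alpha}) + \frac{k}{\delta}\big(\tfrac12\delta^2\lambda_0 - 2\Delta_{n\alpha}\big). \tag{3.42}$$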

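Using (3.39) the last term is positive with probability tending to one uniformly in $\alpha$ for any fixed $\delta > 0$. Hence, given $\delta > 0$ and $\epsilon > 0$, there exist $n_0$ and $\eta > 0$ such that for $n \ge n_0$,

$$P\Big\{\inf_{\alpha_n^* \le \alpha \le 1-\alpha_n^*}\Big[\inf_{\|\mathbf{t} - \mathbf{U}_{n\alpha}\| \ge \delta} G_{n\alpha}(\mathbf{t}) - G_{n\alpha}(\mathbf{U}_{n\alpha})\Big] > \eta\Big\} > 1 - \epsilon \tag{3.43}$$

and hence (since the event in (3.43) implies that $G_{n\alpha}$ must be minimized inside the ball of radius $\delta$)

$$P\Big(\sup_{\alpha_n^* \le \alpha \le 1-\alpha_n^*}\|\mathbf{T}_{n\alpha} - \mathbf{U}_{n\alpha}\| \le \delta\Big) \to 1$$

for any fixed $\delta > 0$, as $n \to \infty$. $\Box$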
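The completing-the-square step behind (3.36) and (3.37) in the proof above can be verified directly. In this sketch the subscripts $n\alpha$ are dropped, $\mathbf{D}_n$ is taken to be symmetric and positive definite, and $\mathbf{U} = \mathbf{D}_n^{-1}\mathbf{Z}$ as in (3.35):

```latex
\begin{align*}
\tfrac12(\mathbf{t}-\mathbf{U})'\mathbf{D}_n(\mathbf{t}-\mathbf{U}) - \tfrac12\mathbf{U}'\mathbf{D}_n\mathbf{U}
 &= \tfrac12\mathbf{t}'\mathbf{D}_n\mathbf{t} - \mathbf{t}'\mathbf{D}_n\mathbf{U}
  = \tfrac12\mathbf{t}'\mathbf{D}_n\mathbf{t} - \mathbf{t}'\mathbf{Z},
  \qquad \text{since } \mathbf{D}_n\mathbf{U} = \mathbf{Z},\\
\min_{\mathbf{t}}\big\{-\mathbf{t}'\mathbf{Z} + \tfrac12\mathbf{t}'\mathbf{D}_n\mathbf{t}\big\}
 &= -\tfrac12\mathbf{U}'\mathbf{D}_n\mathbf{U}
  = -\tfrac12\mathbf{Z}'\mathbf{D}_n^{-1}\mathbf{Z},
  \qquad \text{attained at } \mathbf{t} = \mathbf{U}.
\end{align*}
```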
The following theorem approximates the regression rank score process by an empirical process.

THEOREM 3.2. Let $\mathbf{d}_n$ satisfy (D.1) - (D.3), $\mathbf{X}_n$ satisfy (X.1) - (X.4) and $F$ satisfy (F.1) - (F.4). Then

$$\sup_{\alpha_n^* \le \alpha \le 1-\alpha_n^*}\Big\{\Big|n^{-1/2}(\alpha(1-\alpha))^{-1/2}\sum_{i=1}^n d_{ni}\big(\hat a_{ni}(\alpha) - \tilde a_i(\alpha)\big)\Big|\Big\} \xrightarrow{P} 0 \tag{3.44}$$

as $n \to \infty$, where

$$\tilde a_i(\alpha) = I[E_i > F^{-1}(\alpha)], \qquad i = 1, \ldots, n. \tag{3.45}$$


PROOF. Insert $n^{1/2}\sigma_\alpha^{-1}(\hat\beta_n(\alpha) - \beta(\alpha))$ for $\mathbf{t}$ in (3.26) and notice (3.29) and the fact that

$$\sup_{\alpha_n^* \le \alpha \le 1-\alpha_n^*}\Big\{n^{-1/2}(\alpha(1-\alpha))^{-1/2}\sum_{i=1}^n d_{ni}\, I[Y_i = \mathbf{x}_i'\hat\beta_n(\alpha)]\Big\} \xrightarrow{P} 0, \tag{3.46}$$

from which (3.44) follows. $\Box$

The following theorem, which follows from Theorem 3.2, is an extension of Theorem V.3.5 in Hajek and Sidak (1967) to the regression rank scores. Some applications of this result to Kolmogorov-Smirnov type tests appear in Jureckova (1991).

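THEOREM 3.3. Under the conditions of Theorem 3.2, as $n \to \infty$,

$$\sup_{0 \le \alpha \le 1}\Big\{\Big|n^{-1/2}\sum_{i=1}^n d_{ni}\big(\hat a_{ni}(\alpha) - \tilde a_i(\alpha)\big)\Big|\Big\} \xrightarrow{P} 0. \tag{3.47}$$

Moreover, the process

$$\Big\{\Delta^{-1} n^{-1/2}\sum_{i=1}^n d_{ni}\,\hat a_{ni}(\alpha) : 0 \le \alpha \le 1\Big\} \tag{3.48}$$

converges to the Brownian bridge in the Prokhorov topology on $C[0, 1]$.

PROOF. By Theorem 3.2,

$$\sup_{\alpha_n^* \le \alpha \le 1-\alpha_n^*}\Big|n^{-1/2}\sum_{i=1}^n d_{ni}\big(\hat a_{ni}(\alpha) - \tilde a_i(\alpha)\big)\Big| \xrightarrow{P} 0. \tag{3.49}$$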

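Further, using the fact that $\sum_{i=1}^n (1 - \hat a_{ni}(\alpha)) = n\alpha$, due to the linear constraints in (1.5),

$$\sup_{0 \le \alpha \le \alpha_n^*}\Big|n^{-1/2}\sum_{i=1}^n d_{ni}\,\hat a_{ni}(\alpha)\Big| = \sup_{0 \le \alpha \le \alpha_n^*}\Big|n^{-1/2}\sum_{i=1}^n d_{ni}\big(1 - \hat a_{ni}(\alpha)\big)\Big| \le n^{1/2}\max_{1 \le i \le n}|d_{ni}|\,\alpha_n^* = O\Big(n^{\frac12 + \frac{2(b-a)-\delta}{1+4b} - \frac{1}{1+4b}}\Big) = O(n^{-2\delta}) \tag{3.50}$$

and we obtain an analogous conclusion for $\sup_{1-\alpha_n^* \le \alpha \le 1}\big|n^{-1/2}\sum_{i=1}^n d_{ni}\,\hat a_{ni}(\alpha)\big|$. On the other hand,

$$\sup_{0 \le \alpha \le \alpha_n^*}\Big|n^{-1/2}\sum_{i=1}^n d_{ni}\,\tilde a_i(\alpha)\Big| = \sup_{0 \le \alpha \le \alpha_n^*}\Big|n^{-1/2}\sum_{i=1}^n d_{ni}\big(I[E_i \le F^{-1}(\alpha)] - \alpha\big)\Big| \le \max_{1 \le i \le n}|d_{ni}| \cdot O_p\big((\alpha_n^*(1-\alpha_n^*))^{1/2}\big) = o_p(1) \tag{3.51}$$

and analogously

$$\sup_{1-\alpha_n^* \le \alpha \le 1}\Big|n^{-1/2}\sum_{i=1}^n d_{ni}\,\tilde a_i(\alpha)\Big| = o_p(1).$$

Thus (3.47) follows, and consequently (3.48). $\Box$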
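The constraint $\sum_i (1 - \hat a_{ni}(\alpha)) = n\alpha$ used in the proof above is easy to see in the location model, where the design is a single column of ones and the regression rank scores reduce to the rank scores of Hajek and Sidak (1967). The sketch below covers only that special case, not the general dual linear program. It assumes distinct observations and takes the score definition $\hat b_{ni} = -\int_0^1 \varphi(t)\,d\hat a_{ni}(t)$ as our reading of (2.7), evaluated for Wilcoxon scores $\varphi(t) = t - 1/2$. All function names are hypothetical.

```python
def ranks_of(y):
    """Ranks R_i (1 = smallest) of the observations, assumed distinct."""
    ranks = [0] * len(y)
    for r, i in enumerate(sorted(range(len(y)), key=lambda k: y[k]), start=1):
        ranks[i] = r
    return ranks


def hajek_rank_scores(y, alpha):
    """Location-model rank scores a_i(alpha).

    a_i = 1 if alpha <= (R_i - 1)/n, a_i = R_i - n*alpha on the next
    interval of length 1/n, and a_i = 0 once alpha >= R_i/n.
    """
    n = len(y)
    return [min(1.0, max(0.0, r - n * alpha)) for r in ranks_of(y)]


def wilcoxon_scores(y):
    """Closed form of -int_0^1 phi(t) d a_i(t) for phi(t) = t - 1/2."""
    n = len(y)
    return [(r - 0.5) / n - 0.5 for r in ranks_of(y)]


def wilcoxon_scores_numeric(y, cells_per_knot=50):
    """Same scores via integration by parts: int_0^1 a_i(t) dt - 1/2.

    Uses the midpoint rule on a grid whose cell edges contain the
    knots j/n, so each cell lies inside one linear piece of a_i.
    """
    n = len(y)
    m = cells_per_knot * n
    totals = [0.0] * n
    for k in range(m):
        scores = hajek_rank_scores(y, (k + 0.5) / m)
        for i in range(n):
            totals[i] += scores[i] / m
    return [t - 0.5 for t in totals]


y = [2.3, -0.4, 1.1, 0.7, 5.0]
a = hajek_rank_scores(y, 0.37)
print(sum(1.0 - v for v in a))  # close to n * alpha = 1.85
print(wilcoxon_scores(y))
```

Because the scores are piecewise linear in $\alpha$ with knots at multiples of $1/n$, the midpoint rule on this grid reproduces the closed form up to rounding error.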


4.      Asymptotic  properties  of  simple  linear  regression  rank  scores  statistics 

Maintaining the notation of Section 3, let $\varphi(t): 0 < t < 1$ be a nondecreasing square-integrable score-generating function and let $\hat b_{ni}$, $i = 1, \ldots, n$ be the scores defined by (2.7). Let $\{\mathbf{d}_n\}$ be a sequence of vectors satisfying (D.1) - (D.3). Following Hajek and Sidak (1967), we shall call the statistics

$$S_n = n^{-1/2}\sum_{i=1}^n d_{ni}\,\hat b_{ni} \tag{4.1}$$

simple linear regression rank-score statistics, or just simple linear rank statistics. Our primary objective in this section is to investigate the conditions on $\varphi$ under which we may integrate (3.47) and obtain an asymptotic representation for $S_n$ of the form

$$S_n = n^{-1/2}\sum_{i=1}^n d_{ni}\,\varphi(F(E_i)) + o_p(1). \tag{4.2}$$

We shall prove (4.2) for $\varphi$ satisfying a condition of the Chernoff-Savage (1958) type; thus our results will cover Wilcoxon, van der Waerden (Normal), and median scores, among others.

THEOREM 4.1. Let $\varphi(t): 0 < t < 1$, be a nondecreasing square integrable function such that $\varphi'(t)$ exists for $0 < t < \alpha_0$, $1-\alpha_0 < t < 1$ and satisfies

$$|\varphi'(t)| \le c\,(t(1-t))^{-1-\delta^*} \tag{4.3}$$

for some $\delta^* < \delta$ where $\delta$ is given in condition (X.4), and for $t \in (0, \alpha_0) \cup (1-\alpha_0, 1)$. Then, under (F.1) - (F.4), (X.1) - (X.4) and (D.1) - (D.3), the statistic $S_n$ admits the representation (4.2) and hence is asymptotically normally distributed with zero expectation and with variance

$$\Delta^2\Big(\int_0^1 \varphi^2(t)\,dt - \bar\varphi^2\Big), \qquad \bar\varphi = \int_0^1 \varphi(t)\,dt. \tag{4.4}$$
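As an illustration of (4.4), the factor $\int_0^1 \varphi^2(t)\,dt - \bar\varphi^2$ can be evaluated in closed form for the three score functions mentioned above. This is a sketch using the standard forms of these scores, which are our assumption since they are not written out in this section; $\Phi$ and $\phi$ denote the standard normal distribution function and density.

```latex
\begin{align*}
\text{Wilcoxon: } \varphi(t) = t - \tfrac12, &\quad \bar\varphi = 0, \quad
  \int_0^1 \varphi^2(t)\,dt - \bar\varphi^2 = \int_0^1 (t-\tfrac12)^2\,dt = \tfrac{1}{12},\\
\text{Normal: } \varphi(t) = \Phi^{-1}(t), &\quad \bar\varphi = 0, \quad
  \int_0^1 \big(\Phi^{-1}(t)\big)^2\,dt = \int_{-\infty}^{\infty} x^2\,\phi(x)\,dx = 1,\\
\text{Median: } \varphi(t) = \tfrac12\,\mathrm{sign}(t-\tfrac12), &\quad \bar\varphi = 0, \quad
  \int_0^1 \varphi^2(t)\,dt = \tfrac14 .
\end{align*}
```

Under these assumptions the limiting variances of $S_n$ in (4.4) would be $\Delta^2/12$, $\Delta^2$ and $\Delta^2/4$, respectively.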

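PROOF. Let us consider $S_n$ defined in (4.1) with the scores (2.7). Integrating by parts (notice that $\hat a_{ni}(t) - \tilde a_i(t) = 0$ for $t = 0, 1$), we obtain

$$-n^{-1/2}\sum_{i=1}^n d_{ni}\int_0^1 \varphi(t)\,d\big(\hat a_{ni}(t) - \tilde a_i(t)\big) = n^{-1/2}\sum_{i=1}^n d_{ni}\int_0^1 \big(\hat a_{ni}(t) - \tilde a_i(t)\big)\,d\varphi(t), \tag{4.5}$$

which we must show is $o_p(1)$. We shall split the domain of integration into the intervals $(0, \alpha_n^*]$, $(\alpha_n^*, \alpha_0)$, $[\alpha_0, 1-\alpha_0]$, $(1-\alpha_0, 1-\alpha_n^*)$, $[1-\alpha_n^*, 1)$ and denote the respective integrals by $I_1, \ldots, I_5$. Regarding Theorem 3.2, we immediately get that $I_3 \xrightarrow{P} 0$ by the dominated convergence theorem. Similarly, for $\delta^* < 1/2$,

$$|I_2| \le \int_{\alpha_n^*}^{\alpha_0} |\varphi'(t)|\,\Big|n^{-1/2}\sum_{i=1}^n d_{ni}\big(\hat a_{ni}(t) - \tilde a_i(t)\big)\Big|\,dt \le c\int_{\alpha_n^*}^{\alpha_0} (t(1-t))^{-1-\delta^*}(t(1-t))^{1/2}\,\Big|n^{-1/2}(t(1-t))^{-1/2}\sum_{i=1}^n d_{ni}\big(\hat a_{ni}(t) - \tilde a_i(t)\big)\Big|\,dt = c\int_{\alpha_n^*}^{\alpha_0} (t(1-t))^{-\delta^*-1/2}\,dt \cdot o_p(1) = o_p(1).$$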

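Finally,

$$|I_1| \le n^{-1/2}\max_{1 \le i \le n}|d_{ni}|\int_0^{\alpha_n^*} |\varphi'(t)|\sum_{i=1}^n |\hat a_{ni}(t) - \tilde a_i(t)|\,dt \le I_{11} + I_{12},$$

where

$$I_{11} = n^{-1/2}\max_{1 \le i \le n}|d_{ni}|\int_0^{\alpha_n^*} |\varphi'(t)|\sum_{i=1}^n \big(1 - \hat a_{ni}(t)\big)\,dt$$

and

$$I_{12} = n^{-1/2}\max_{1 \le i \le n}|d_{ni}|\int_0^{\alpha_n^*} |\varphi'(t)|\sum_{i=1}^n \big(1 - \tilde a_i(t)\big)\,dt.$$

Then

$$I_{11} \le c\,n^{1/2}\max_{1 \le i \le n}|d_{ni}|\int_0^{\alpha_n^*} t^{-\delta^*}\,dt = O\Big(n^{\frac12 + \frac{2(b-a)-\delta}{1+4b} - \frac{1-\delta^*}{1+4b}}\Big) = O\big(n^{-2(\delta-\delta^*)}\big). \tag{4.6}$$

Finally,

$$I_{12} = n^{-1/2}\sum_{i=1}^n d_{ni}\int_0^{\alpha_n^*} \varphi'(t)\,I[t \ge F(E_i)]\,dt = n^{-1/2}\sum_{i=1}^n d_{ni}\big(\varphi(\alpha_n^*) - \varphi(F(E_i))\big)\,I[F(E_i) \le \alpha_n^*]. \tag{4.7}$$

Now we may assume that $\varphi(\alpha_n^*) < 0$ for $n \ge n_0$, since otherwise if $\varphi$ were bounded from below then $I_{12} \xrightarrow{P} 0$. Hence

$$\mathrm{Var}(I_{12}) \le n^{-1}\sum_{i=1}^n d_{ni}^2\, E\big([2\varphi(F(E_i))]^2\, I[F(E_i) \le \alpha_n^*]\big) \le \int_0^{\alpha_n^*} \varphi^2(u)\,du \cdot O(1) \to 0$$

due to the square-integrability of $\varphi$. Treating the integrals $I_4$, $I_5$ analogously, we arrive at (4.5) and this proves the representation (4.2). $\Box$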


5.      Tests  of  linear  subhypotheses  based  on  regression  rank  scores 

Returning to the model (2.2), assume that the design matrix $\mathbf{X} = (\mathbf{X}_1 \,\vdots\, \mathbf{X}_2)$ satisfies the conditions (X.1) - (X.4), (2.3) and (2.4). We want to test the hypothesis $H_0: \beta_2 = 0$ ($\beta_1$ unspecified) against the alternative $H_n: \beta_{2n} = n^{-1/2}\beta_0$ ($\beta_0 \in R^q$ fixed).

Let $\hat{\mathbf{a}}_n(\alpha) = (\hat a_{n1}(\alpha), \ldots, \hat a_{nn}(\alpha))$ denote the regression rank scores corresponding to the submodel

$$\mathbf{Y} = \mathbf{X}_1\beta_1 + \mathbf{E} \quad \text{under } H_0. \tag{5.1}$$

Let $\varphi(t): (0, 1) \to R^1$ be a nondecreasing and square integrable score-generating function. Define the scores $\hat b_{ni}$, $i = 1, \ldots, n$ by the relation (2.7), and consider the test statistic

$$T_n = \mathbf{S}_n'\mathbf{Q}_n^{-1}\mathbf{S}_n / A^2(\varphi) \tag{5.2}$$

where

$$\mathbf{S}_n = n^{-1/2}(\mathbf{X}_{n2} - \hat{\mathbf{X}}_{n2})'\,\hat{\mathbf{b}}_n \tag{5.3}$$

and where $\mathbf{Q}_n$ and $A^2(\varphi)$ are defined in (2.4) and (2.10), respectively. The test is based on the asymptotic distribution of $T_n$ under $H_0$, given in the following theorem. Thus, we shall reject $H_0$ provided $T_n \ge \chi^2_q(\omega)$, i.e. provided $T_n$ exceeds the $\omega$ critical value of the $\chi^2$ distribution with $q$ d.f. The same theorem gives the asymptotic distribution of $T_n$ under $H_n$ and thus shows that the Pitman efficiency of the test coincides with that of the classical rank test.
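To show how (5.2) and (5.3) fit together, here is a minimal sketch for the simplest case: $\mathbf{X}_1$ is a single intercept column and $q = 1$. Several ingredients are assumptions rather than statements from this section. With an intercept-only submodel the regression rank scores are taken to reduce to ordinary rank scores, so that Wilcoxon scores are $(R_i - 1/2)/n - 1/2$. $\hat{\mathbf{X}}_{n2}$ is taken to be the projection of $\mathbf{X}_{n2}$ on $\mathbf{X}_1$, $\mathbf{Q}_n$ is taken to be $n^{-1}(\mathbf{X}_{n2} - \hat{\mathbf{X}}_{n2})'(\mathbf{X}_{n2} - \hat{\mathbf{X}}_{n2})$, and $A^2(\varphi) = 1/12$ for Wilcoxon scores. The function names are hypothetical.

```python
import math


def wilcoxon_scores(y):
    """Wilcoxon scores (R_i - 1/2)/n - 1/2 from the ranks R_i of y (no ties)."""
    n = len(y)
    scores = [0.0] * n
    for r, i in enumerate(sorted(range(n), key=lambda k: y[k]), start=1):
        scores[i] = (r - 0.5) / n - 0.5
    return scores


def rank_score_statistic(y, x2):
    """T_n of (5.2) for q = 1 when X_1 is only an intercept column."""
    n = len(y)
    b = wilcoxon_scores(y)
    mean_x2 = sum(x2) / n
    centered = [v - mean_x2 for v in x2]      # X_2 minus its projection on X_1
    s_n = sum(c * bi for c, bi in zip(centered, b)) / math.sqrt(n)  # (5.3)
    q_n = sum(c * c for c in centered) / n    # assumed form of Q_n
    a2 = 1.0 / 12.0                           # assumed A^2(phi) for Wilcoxon scores
    return s_n * s_n / (q_n * a2)


# Upper 5% point of the chi-square distribution with 1 degree of freedom.
CHI2_1_CRITICAL_5PCT = 3.841

x2 = [float(v) for v in range(1, 11)]
y = list(x2)                                  # perfectly monotone response
t_n = rank_score_statistic(y, x2)
print(t_n, t_n > CHI2_1_CRITICAL_5PCT)
```

For this perfectly monotone example the statistic works out to about 9.9, so the hypothesis $\beta_2 = 0$ would be rejected at the 5% level. With a general $\mathbf{X}_1$ the scores would instead come from the dual linear program defining the regression rank scores.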


THEOREM 5.1. Assume that $\mathbf{X}_1$ satisfies (X.1) - (X.4) and $(\mathbf{X}_1 \,\vdots\, \mathbf{X}_2)$ satisfies (2.3) and (2.4). Further assume that $F$ satisfies (F.1) - (F.4). Let $T_n$ defined in (5.2) and (5.3) be generated by the score function $\varphi$ satisfying (4.3), and nondecreasing and square-integrable on $(0, 1)$.

(i) Then, under $H_0$, the statistic $T_n$ is asymptotically central $\chi^2$ with $q$ degrees of freedom.

(ii) Under $H_n$, $T_n$ is asymptotically noncentral $\chi^2$ with $q$ degrees of freedom and with noncentrality parameter

$$\eta^2 = (\beta_0'\mathbf{Q}\beta_0)\,\gamma^2(\varphi, F) / A^2(\varphi) \tag{5.4}$$

with

$$\gamma(\varphi, F) = -\int_0^1 \varphi(t)\,df(F^{-1}(t)). \tag{5.5}$$
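As a worked instance of (5.4) and (5.5), take Wilcoxon scores $\varphi(t) = t - 1/2$ and suppose $f(F^{-1}(t)) \to 0$ as $t \to 0$ and $t \to 1$, so that the boundary terms vanish on integrating by parts. The standard normal $F = \Phi$ is used purely as an example, and $A^2(\varphi) = \int_0^1 \varphi^2(t)\,dt - \bar\varphi^2$ is our assumed reading of (2.10).

```latex
\begin{align*}
\gamma(\varphi, F) &= -\int_0^1 (t - \tfrac12)\,df(F^{-1}(t))
   = \int_0^1 f(F^{-1}(t))\,dt = \int_{-\infty}^{\infty} f^2(x)\,dx,\\
F = \Phi:\quad \gamma(\varphi, \Phi) &= \frac{1}{2\sqrt{\pi}}, \qquad
   A^2(\varphi) = \frac{1}{12}, \qquad
   \eta^2 = \beta_0'\mathbf{Q}\beta_0\,\frac{\gamma^2}{A^2(\varphi)}
          = \frac{3}{\pi}\,\beta_0'\mathbf{Q}\beta_0 .
\end{align*}
```

The factor $3/\pi \approx 0.955$ is the familiar Pitman efficiency of the Wilcoxon test relative to the least squares test under normal errors with unit variance, in line with the remark that the efficiency coincides with that of the classical rank test.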

REMARKS.

(i) If $\varphi$ is of bounded variation and is constant near 0 and 1, the representation given in Theorem 2 (ii) of Gutenbrunner and Jureckova (1992) could be used to provide the conclusion of Theorem 5.1 under somewhat weaker hypotheses; namely, (X.1), (X.2), $\max_i \|\mathbf{x}_i\| = o(n^{1/2})$, $F$ has finite Fisher Information, and $0 < f < \infty$ on $\{0 < F < 1\}$.

(ii) The analogy between the location and regression models concerning the noncentrality parameter $\gamma(\varphi, F)$ may be extended in the following way: instead of defining local alternatives via (2.6), the definition of Behnen (1972) can be generalized to the regression model. That is, with $F_i(t) = F(t - \mathbf{x}_{1i}'\beta_1)$ and $G_i = \mathcal{L}(Y_i)$, consider

$$H_0: G_i = F_i \qquad \text{vs.} \qquad H_n: \frac{dG_i}{dF_i} = 1 + \mathbf{x}_{2i}'\beta_{2n}\,h_n(F_i)$$

where

$$\beta_{2n} = n^{-1/2}\beta_0, \qquad h_n \to h \in L_2(0, 1), \qquad \text{and} \qquad \max_i \|\mathbf{x}_{2i}\|\,\|h_n\|_\infty = o(n^{1/2}).$$

In this setting, even without the assumption of finite Fisher Information, (4.2) implies that the conclusion of Theorem 5.1 holds with $\gamma(\varphi, F)$ in (5.4) replaced by the $F$-independent constant

$$\gamma(\varphi, h) = \frac{\int (\varphi(u) - \bar\varphi)(h(u) - \bar h)\,du}{\Big(\int (\varphi(u) - \bar\varphi)^2\,du \int (h(u) - \bar h)^2\,du\Big)^{1/2}},$$

i.e., the correlation of the functions $\varphi$ and $h$. Such local alternatives provide insight into the structure of the regions of constant efficiency for regression rank tests.

PROOF.

(i) It follows from Theorem 4.1 that, under $H_0$, $\mathbf{S}_n$ has the same asymptotic distribution as

$$\tilde{\mathbf{S}}_n = n^{-1/2}(\mathbf{X}_{n2} - \hat{\mathbf{X}}_{n2})'\,\tilde{\mathbf{b}}_n$$

where $\tilde{\mathbf{b}}_n = (\tilde b_{n1}, \ldots, \tilde b_{nn})'$ and $\tilde b_{ni} = \varphi(F(E_i))$, $i = 1, \ldots, n$. The asymptotic distribution of $\tilde{\mathbf{S}}_n$ follows from the central limit theorem and coincides with the $q$-dimensional normal distribution with expectation 0 and covariance matrix $\mathbf{Q} \cdot A^2(\varphi)$.

(ii) The sequence of local alternatives $H_n$ is contiguous with respect to the sequence of null distributions with the densities $\{\prod_{i=1}^n f(e_i)\}$. Hence, (4.2) holds also under $H_n$ and the asymptotic distributions of $\mathbf{S}_n$ and $\tilde{\mathbf{S}}_n$ under $H_n$ coincide. The proposition then follows from the fact that the asymptotic distribution of $\tilde{\mathbf{S}}_n$ under $H_n$ is normal $N_q\big(\gamma(\varphi, F)\,\mathbf{Q}\beta_0,\ \mathbf{Q}\,A^2(\varphi)\big)$. $\Box$